PREFACE


The  purpose  of  IDA  Memorandum  M-496,  Bibliography  of  Testing  and  Evaluation  Reference
Material,  is  to  present  the  reference  material  acquired  in  the  course  of  developing  IDA
Paper  P-2132,  SDS  Testing  and  Evaluation:  A  Review  of  the  State-of-the-Art  in  Software  Testing
and  Evaluation  With  Recommended  R&D  Tasks.  This  document  was  prepared  for  the
Strategic  Defense  Initiative  Organization  (SDIO).


1.  SUBJECT  INDEX 

A1  Certification,  see  also  Security  Verification 
structuring  a  system  for,  [Good84a] 

ADAMAT,  [Kell85a],  [Kell85b],  [Perk87] 
analysis  and  validation  of  metrics,  [Perk86] 

AFFIRM,  [Gerh80],  [Thom81] 
applications  of, 

specification/verification  of  communication  protocols,  [Suns77],  [Suns82] 
comparison  with  other  techniques,  [Cheh81],  [Mill81b] 
example  of  model  formulation  and  analyzer  for  PSL/PSA,  [Gerh84] 
status  and  future  directions,  [Kemm86] 
underlying  formalisms,  [Muss79],  [Suns77],  [Suns82] 

ANNA:  A  Language  for  Annotating  Ada  Programs,  [Krie80],  [Luck84a],  [Luck84b] 
automated  support  for,  [Luck85] 
consistency  checking  in  Ada  and  ANNA,  [Krie83] 
implementation  of  a  subset,  [Sank85] 
transformation  approach,  [Sank85],  [Sank86] 
uses  of, 

comparative  testing,  [Luck85] 

self-checking  Ada  programs,  [Luck85] 

semantic  specification  of  Ada  packages,  [vonH85] 

ASSET:  A  System  to  Select  and  Evaluate  Tests,  [Fran85a],  [Fran85b] 
example  script,  [Fran88] 

ATTEST, 

AID:  ATTEST  Interface  Description  Language,  [Wint78] 
constraint  management  in,  [Dill81] 

Abstract  Data  Types,
applications  of, 
formal  verification,  [Flon77] 
functional  testing,  [Boug85a],  [Boug86],  [Choq86] 
performance  analysis,  [Boot80] 
programming,  [Gutt75] 

prototypes  and  implementation  models,  [Belk86] 
to  simplify  modifications,  [Lind76] 
automated  support  for,  see  DAISTS,  [Gann80] 
specification  (and  verification)  of,  [Flon77] 
access-right  expressions  for  sequential  constraints,  [Kieb83] 
algebraic  techniques,  [Gogu78],  [Gutt77],  [Gutt78b],  [Zill74] 
automated  support  for,  AFFIRM,  DAISTS 
logic  programming,  [Boug86] 

proof  of  correctness  of  implementations  of,  [Gutt78a] 
axiomatic  techniques, 

applied  to  modeling  specification  languages,  [Gerh84]
testing  completeness  of, 
automated  support  for,  [Jalo89] 
comparison  of  specification  formalisms,  [Emde81] 
evaluation  of  specification  techniques,  [Lisk75] 
hierarchical  specification,  [Wirs83] 
using  state  machines,  [Shan82] 
survey  of  techniques,  [Shan82] 
Acceptance  Testing,

debugging  techniques,  [Laue79] 
evaluating  readiness  for,  [Bowe79] 
example  of, 

for  Ada  compilers,  [Amor89] 
management  using  coverage  measures,  [Basi84a] 
methods,  [CSC78],  [Cele81] 

based  on  internal  program  state  behavior  data,  [Prot88] 
screening  criteria  for  military  software,  [Pari76] 
supported  by  reliability  measurement,  [Thom80] 

Ada, 

acceptance  testing, 
example  for  compilers,  [Amor89] 
animation  of  programs,  [Feld89] 
automated  support  for,  Development  Environments 
interactive  Ada,  [Stan83] 
programming-in-the-large,  [Wolf85a]
run-time  environments,  problems  with,  [Bend89] 
using  DIANA  trees  as  an  internal  form,  [Rose84],  [Rose85a] 
communication  protocols,

Remote  Entry  Call/Remote  Procedure  Call,  [Schu81] 
compiler  validation,  [Will89] 
debugging,  [ContSS] 

AdaTAD,  [Fain85],  [Fain86],  [Lind88c] 

VIPS:  Visual  and  Interactive  Programming  Support,  [Isod87] 
annotation  based  see  also  ANNA,  TSL 
capabilities  needed,  [Brin85] 
graphical  debugger,  [Mora85] 
graphics-oriented  animation  tool,  [Feld89] 
knowledge-based  see  also  STAD, 
reconstructing  execution  host/target,  [Tayl82b] 
saving  traces  for,  [LeDo85] 
symbolic  debugger,  [DiMa85],  [Maug85],  [Maye89] 
definition  of,  [HONE80] 

AVA:  Annotated  Verifiable  Ada,  [Smit88] 

EEC  formal  definition,  [Mayf86] 

axiomatic  semantics  for  exceptions,  [Luck80b] 

definitions  of  modularity  and  their  application,  [Katz87] 

denotational  semantics,  [Mear83] 

description  of  language  using  Petri  Nets,  [Mand85] 

problems  with  task  semantics,  [Germ84] 

virtual  machine  for,  [Grov80] 

experiments  in,  [Agre86],  [Basi82b],  [Basi84d],  [Godf87] 
impact  on  reliability,  [Goel88] 
lessons  learned,  [Basi85h],  [Brop87]
fault-tolerance, 

correspondent  computing,  [Lee89a] 
formal  verification  of,  [Luck80a],  [Luck80b],  [McGe82]
current  issues,  [Mayf85],  [Mayf86],  [Roby85] 
proof  system  for  tasks,  [Barr82],  [Gert84],  [Mear81] 
using  symbolic  execution,  [Dill87],  [Dill88a],  [Dill88b],  [Harr88c] 
specification  and  proving  for  exceptions,  [Luck80c]
interface  control  see  also  AdaPIC  Toolset, 
measurement  of,

Partial  Metrics  System  for  metric-driven  design,  [Reyn86],  [Reyn87],  [Reyn89] 
automated  support  for, 
metric  analysis,  experience  with,  [Ande88] 
characterization  of  an  Ada  development,  [Basi85h] 
metrics  for,  [Basi83a],  [Gann83],  [Gann85],  [Gann86] 

Cyclomatic  Complexity, 

ACE  metric  and  tool,  [Taus87a],  [Taus87b],  [Taus88] 

Software  Science  for  designs,  [Szul84] 
analysis  and  validation  of,  [Perk86] 
automated  support  for  see  also  ADAMAT, 
classification  of,  [Basi84c] 
example  of  WIS  metrics,  [Dela88] 
structure  and  maintainability,  [Katz86] 
complexity,  [Leac87] 
software  structure, 

based  on  DQL:  DIANA  Query  Language,  [Byrn89]
performance  evaluation, 
of  compilers,  [Shaw89] 

benchmarks  for  real-time  system  compilers,  [Goel89] 
of  programs,  [Stan83] 
of  run-time  environments,  [Bend89] 
performance  analyzer,  [Maye89] 
supported  by  model  simulation,  [Lee89b] 
programmer  errors,  [Good86b] 
quality  assurance, 

planning  for  Ada  development  with  2167,  [Bark89] 
reproducible  testing,  [Tai85b] 
reuse, 

Ada  Software  Repository,  [Conn87] 
metric  analysis  of,  [Leac89] 

Moorehouse  object-oriented  reuse  library,  [Jone89] 

RLF:  Reusability  Library  Framework  project, 
domain  modeling,  [Sold89] 
hypertext  for  taxonomies  of  packages,  [Lato89] 
reusability  analysis  and  measurement,  [Romb88g],  [Romb88h] 
run-time  environments, 
evaluation  and  selection  of,  [Lefk89] 
run-time  monitoring,  [Helm84b] 

ART:  relational  translator  and  interpreter,  [DiMa85] 

ATEST:  Ada  Test  and  Analysis  Tools,  [Maye89] 
based  on  TSL  specifications,  [Helm84a],  [Helm85] 
detection  of  errors  and  evasive  actions,  [Helm83] 

transformation  and  monitoring,  [Germ82a],  [Germ82b],  [Germ84],  [Helm83] 
statement  probes,  [Prob82c] 
simulation, 

TASKIT:  Tasking  Ada  Simulation  Kit,  [Ange89]
specification  languages  see  also  SADMT,  ANNA,  TSL 
ADAM:  Ada-based  language  for  multiprocessing,  [Luck81]
specifying  tasking  using  patterns  of  behavior,  [Meld88]
test  drivers, 

GET  Test  Environment  Generator,  [Bess87] 

TBGEN  Test  Bed  Generator,  [Pout87] 
based  on  UATL:  Universal  Ada  Test  Language,  [Zeig89] 
testing  and  analysis  of, 

ASAP:  Ada  Static  source  code  Analyzer  Program,  [Doub87] 

ATEST:  Ada  Test  and  Analysis  Tools,  [Maye89] 
data  flow  analysis  see  also  STAD, 
mutation  analysis, 
automated  support  for,  [Appe8S] 
reference  manual  for  mutant  operators,  [Bows87] 
safety  analysis  using  fault-tree  analysis,  [Leve83c] 
automated  support  for,  [Cha88] 
standards  checking, 

based  on  DIANA  intermediate  form,  [Byrn89]
static  concurrency  analysis,  [Tayl83a] 
automated  support  for,  [Wamp85] 
supported  by  Petri  nets,  [Mura89],  [Shat88] 
using  TIG  and  TICG  models,  [Long88] 
structural  testing, 

automated  support  for,  [Basi86d],  [Wu87c] 
requisite  support  tools,  [Tayl86a] 
coverage  monitors, 

Test  Coverage  Monitor/Bottleneck  Finder,  [Pout87] 
symbolic  execution,  [Knig85b],  IOGen 
CASEX:  Concurrent  Ada  Symbolic  EXecutor,  [Harr88a] 
symbolic  testing  techniques,  [Clar86b] 

AdaPIC  Toolset,  [Wolf86c] 

Algebraic  Program  Testing,  [Howd76a],  [Howd78b] 
comparison  with  other  techniques,  [Zeil84] 
for  concurrent  systems,  [Avru83] 
probabilistic  approach,  [DeMi77] 

Algol,

debugging  tools  for  Algol  W,  [Satt72],  [Satt75] 
numerical  algorithms  testbed,  [Henn78] 
run-time  monitoring, 

Algol68  numerical  algorithms  testbed,  [Henn76a] 

Alphard, 

support  for  formal  verification,  [Wulf76] 

Animation  of  Programs,

PegaSys:  Programming  Environment  for  the  Graphical  Analysis  of  SYStems,  [Mori83] 

demonstration  system  for  Ada,  [Feld89] 

system  for  algorithm  animation,  [Bent87] 

using  Smalltalk,  [Lond85] 

using  VDM  and  Prolog,  [Bloo86]

Arcadia,  [Wolf86b]

determining  requirements  for  persistent  object  capability, 

PGRAPHITE  model,  [Wile88]

environment  architecture,  [Oste86b],  [Tayl86b],  [Tayl88] 
object  management,  [Oste86a] 
research  objectives,  approaches,  status,  [Tayl86b],  [Tayl87],  [Tayl88]
support  for  process  programming,  [Tayl88] 

Arithmetic  Fault  Detection,  see  also  Perturbation  Testing 
Artificial  Intelligence, 

expert  systems, 

ARROWSMITH-P  for  management,  [Basi85g] 
testing  strategy  for,  [Hite88] 
heuristics, 

analysis  of  nature  and  power  of,  [Pear84] 
characterization  of  methods,  [Pear84] 
heuristic  search  for  error  detection,  [Andr81] 
knowledge-based  testing  environment, 
for  kernel  system  calls  of  UNIX  systems,  [Pesc85] 
support  for  debugging,  [Hara83],  STAD,  [Shap81] 

LAURA  to  debug  student  programs,  [Adam80] 

PROUST,  [John84] 

model  of  fault  localization  process,  [Sedl83] 
support  for  domain  modeling,  [Sold89] 

Assertions, 
automated  support  for, 
input/output  assertion  verifier,  [Siya80] 
temporal  assertions,  [Lamp83] 
uses  of, 

break-point  assertions  for  debugging, 

ALADDIN  for  assembly  language,  [Fair79] 
error  detection, 

combined  with  heuristic  search  algorithms,  [Andr81] 
during  design  see  also  DACC, 

formal  verification  see  also  Inductive  Assertion  [Kell76], 
testing  programs  against  formal  specification,  [Majo83] 
verification  of  program  execution,  [Chen76],  PET 
Automata  Theoretic,  [Chow78] 

Availability  Estimation  and  Measurement, 
modeling  systems  with  hardware/software  faults,  [Land77] 
using  data  from  design/code  inspections,  [Gaff88] 

Axiomatic  Proof  Techniques,  [Chan79],  [Hoar75],  [Owic75],  [Owic76] 

BASIC, 

program  testing  assistant,  [Chan84] 

Backtracking  Techniques, 

proving  correctness  from  control  structure  abstraction,  [Gerh76b] 
Bibliography  on,  [Perr83],  [Youn89b] 

SEL  literature,  [SEL82] 
automated  support  tools,  [DeMi87a] 
formal  verification,  [Bryk89],  [Lond75],  [Yeh77]
intermittent  assertion,  [Grie79] 
invariant  assertion,  [Grie79] 
proving  correctness  of  programs,  [Lond70] 
measurement,  [Bryk89] 
metrics,  [Cook82] 
software  quality,  [Boeh78] 
testing  and  analysis,  [Bryk89],  [DeMi87a],  [Perr88],  [Yeh77],  [Youn89b]
mutation  analysis,  [Guin87] 

Boundary  Value  Analysis,  [Selb86] 

Boyer-Moore  Computational  Logic,  [Boye79],  [Boye81]
comparison  with  other  techniques,  [Crai88a],  [Kauf87a],  [Kauf87b] 
examples  of,  [Russ83] 

FM8501  verified  microprocessor,  [Hunt85],  [Hunt87] 

Goedel’s  incompleteness  theorem,  [Shan87]

RSA  Public  Key  Encryption  Algorithm,  [Boye84b] 

Turing  completeness  of  pure  Lisp,  [Boye83] 
verified  assembler,  [Moor88] 
verified  operating  system  kernel,  [Bevi87] 
theorem  prover,  [Boye79],  [Boye80],  [Boye84a],  [Boye88] 

Branch  Testing, 

based  on  concept  of  essential  branches,  [Chus87] 
automated  support  for,  [Chus87] 
comparison  with  other  techniques,  [Howd77c],  [Ntaf81a] 
problems  and  methods,  [Chus87],  [Huan75] 
types  of  errors  found  and  resource  costs,  [Gann79] 

COBOL, 

automated  support  for, 
mutation  analysis,  CMS1  system,  [Hank80] 
test  data  generator,  [Saud62] 
errors,  error-proneness,  error  diagnosis,  [Lite76] 
program  analysis  for,  [Al-J82]
code-based  model  for  predicting  path  faults,  [Rogg80] 
software  science  analysis  of  programs,  [Shen80] 

Cause-Effect  Graphing,  [Elme73] 

Change  Data, 
applications  of, 

evaluation  of  development,  [Basi82a],  [Weis85c] 
evaluation  of  requirements, 
examples  from  A-7E  requirements,  [Frye81] 

Chief  Programmer  Teams,  [Bake81]
impact  on  quality,  [Bake72a] 

part  of  an  overall  development  methodology,  [Bake72b] 

Classification  of, 

automated  support  tools,  [Reif75],  [Reif79b] 
testing  and  analysis  tools,  [Mill77b] 
cost  models,  [Ducl82] 
data  flow,  [Fosd76a] 

errors,  [Beiz83],  [JohnXX],  [Ostr84],  [RADC76a] 
error  complexity  (measure  of  detectability),  [Naka89] 
formal  verification  methods,  [Mili84] 
heuristic  methods,  [Pear84]
measurement, 

complexity  measures,  [Gors80] 
metrics,  [Basi86c] 
for  Ada,  [Basi84c] 
module  cohesion,  [Emer84] 
productivity  factors,  [Vosb84]
reliability  models,  [Rama82] 
program  structure  types,  [Turn80]
testing  and  analysis  techniques,  [Youn89c] 
dynamic  analysis  techniques,  [Howd81c] 
fault-based  techniques,  [More88]
reliability  validation  techniques,  [Triv80] 

Cleanroom  Software  Development,  [Dyer81b],  [Dyer81c],  [Dyer82c]
certifying  model,  [Curr83] 
certifying  reliability,  [Dyer82b] 

engineering  software  under  statistical  quality  control,  [Mill87a]

evaluation  of,  [Selb85],  [Selb87b] 

project  management  data,  [Dyer81a] 

software  validation,  [Dyer83]

statistical  quality  control,  [Dyer85b] 

statistical  testing  for,  [Dyer82a]

Code  Reading  and  Inspections,  [Faga74],  [Faga76] 
advances  in,  [Faga86] 

comparison  with  other  techniques,  [Hetz76],  [Hwan81],  [Selb86] 
fault  detection  effectiveness/cost  faults,  [Basi85b] 
evaluation  of,  [Selb85] 
experiments  in,  [Cail79],  [Myer78a] 
applications  of  a  probability-based  model,  [Jeli73] 
indicators  of  quality  inspections,  [Buck81] 
uses  of, 

estimating  software  availability,  [Gaff88]
investigating  program  correctness,  [Brit88] 
mechanism  for  error  reduction  rates,  [Faga76] 
quality  assurance,  [Bark89] 
role  in  the  software  life  cycle,  [Basi86b] 

Cohesion, 
applications  of, 

measuring  the  design  process,  [Crui80] 
discriminant  metric  for  classification  of,  [Emer84] 
for  generation  of  hierarchical  system  descriptions, 
based  on  data  bindings,  [Selb88a],  [Selb88b] 

Communicating  Sequential  Processes,  [Hoar85] 

ECSP, 

Concurrent  Debugger,  [Baia85],  [DeFr85],  [Late84] 
static  analysis  of  interprocess  communication, 
automated  support  for,  [Baia84] 
calculus  for  total  correctness,  [Hoar81] 
parallel  composition,  [Hoar78] 

proof  systems  for,  [Apt80],  [Apt83a],  [Levi80],  [Levi81],  [Zhou81] 
static  analysis  of,  [Apt83b] 

Communication  Protocols,  [Sari88a] 
certification  of,  [Bart80] 
design  of, 

computer-aided  design  tool  for  testing,  [Barb88] 
design  validation  using  executable  specifications, 
automated  support  for,  [Jard83] 
formal  methods,  [Boch80]

perturbation  technique  for  reachability  analysis,  [Zafi80] 
automated  support  for,  [Zafi80] 
production  rules  to  prevent  errors,  [Zafi80] 
types  of  errors  and  their  effects,  [Zafi80] 
for  Ada, 

Remote  Entry  Call/Remote  Procedure  Call,  [Schu81] 
security  analysis, 

Interrogator,  [Mill87b] 

specification  (and  verification)  of,  [Boch78],  [Sari84b],  [Sari88a],  [Zhou81] 
CEDAR  Programming  Environment,  [Fem85] 

Escort  for  integrated  v&v  and  simplification,  [Waka89] 

XESAR  sliding  window  protocol  example,  [Rich87b] 
automated  testing  of,  [Ural84] 
state  transition  approach, 
automated  support  for  see  also  AFFIRM, 
semi-automatic  implementation  of  protocols,  [Boch87b] 
using  Gypsy,  [DiVi82] 
using  Lotos,  [ISO87c],  [Najm87]
using  logic  interpreter  SLOG,  [Choq85] 
using  symbolic  execution,  [Bran78] 
via  projections,  [Lam84] 
testing  and  analysis,  [Rubi82],  [Sabn85] 
based  on  finite  state  machines,  [Holz82],  [Waka89] 
automated  support  for,  [Holz82] 
dynamic  analysis,  [Sari88c] 
based  on  finite  state  transition  model, 
limiting  non-determinacy,  [Jard83]

conformance  testing  for  ISO-OSI  protocols,  ISO-OSI  [Stee86] 
using  automaton  models,  [Kato86] 
using  checking  sequences,  [Heng87] 
error  detection  with  multiple  observers,  [Dsso85] 
reachability  analysis,  [Zhao86] 

RGA:  Reachability  Graph  Analyzer,  [Morg86],  [Morg87] 
test  generation,  [Sari88a] 

based  on  finite  state  machines,  [Sari82],  [Sari84a],  [Sari84b],  [Sari87],  [Sari88b] 
CONTEST-FSM  tool,  [Forg87] 

931,  [Uyar86] 

using  T-,  U-,  D-  and  W-  methods, 
evaluation  of  fault  coverage,  [Sidh89] 
testing  methods  based  on  formal  specifications,  [Dsso86] 
trace  analysis,  [Boch88],  [Sari88a] 
for  conformance  and  arbitration  testing,  [Boch87a] 

Compiler-Based  Testing, 

example  for  a  record-oriented  text  editor,  [McMu83] 
static  analysis, 

ECSP  interprocess  communication,  [Baia84] 
using  input-output  specifications,  [Haml77a],  [Haml77b] 

Compiler  Techniques,  [Aho86] 
for  code  analysis  and  generation,  [Payt82] 
optimization  for  asynchronous  multiprocessor,  [Hibb82] 
smart  recompilation,  [Tich86]

to  support  symbolic  debugging,  [John79] 

type  checking  for  separately  compiled  parts,  [Levy84] 

Compiler  Testing, 
acceptance  testing, 
example  for  Ada  compiler,  [Amor89] 
automated  support  for, 
symbolic  interpretation,  [Same76] 
syntax  machine  to  generate  random  test  cases,  [Hanf70] 
test  generator  based  on  grammars,  [Bazz82],  [Cele80],  [Hous77] 
performance  evaluation,  [Shaw89] 
benchmarks  for  Ada  real-time  compilers,  [Goel89] 
specification  and  verification,  [Pola81]
Compiler  Verification, 
for  micro  Gypsy,  [Youn86c] 

Complexity,  [INFO76]
computational  complexity, 

Blum  axioms  and  complexity  gaps,  [Boro72] 
applications  of,  [Pipp78] 
inductive  inference,  [Angl76] 
examples  of, 

of  specific  control  flow  measures,  [Howa85] 
theory  of,  [Hart71],  [Pipp78] 
framework  for  research,  [Rabi77] 
process  complexity, 

of  analyzing  synchronization  structure,  [Tayl83b] 
of  modeling  information  systems,  [Mart70] 
of  temporal  logic,  [Sist88] 
programming  complexity, 

impact  of  programming  factors,  [Duns78a],  [Duns78b] 
measures  for,  [Duns77],  [Duns78a],  [Duns78b] 
relation  with  problem  complexity,  [Wood79b] 
testing  complexity,  [Tai80] 
program  complexity,  [Bell74],  [McTaXX] 
as  integral  part  of  development,  [McCl76]
contributing  factors,  [Gors80],  [McCl78a],  [Unde63]
experiment  in,  [Zoln76] 

program  structure,  [Gree76],  [Piwo82],  [Schn77c] 
results  from  a  Delphi  Survey,  [Zoln81] 
control  of,  [Dijk76b],  [McCl78a],  [McCl78b]
impact  on, 

error  characteristics,  [Basi82d],  [Grem84] 
error  detectability,  [Gree76] 
maintainability,  [Vess83],  [Wake88] 
programmer  productivity,  [Chen78a] 
relationship  with, 
information  content,  [Shoo79] 
psychological  complexity,  [Evan83b] 
statistical  language  theory  and,  [Laem78] 
psychological  complexity,  [Love77b] 
of  programs,  [Weis74] 
relationship  with  program  complexity,  [Evan83b]
resource  complexity, 

coordinating  personnel  activities,  [Theb84] 
statistical  theory,  [Shoo77b] 

Complexity  Measures,  see  also  Software  Science,  Cyclomatic  Complexity  [Sull75] 
a  characteristic  set,  [Elsh84] 
applications  of,  [Kear86] 

assessing  module  accessibility/testability,  [Moha76b] 
cost  estimation,  [Ducl82] 
automated  support  for, 

FORTRANL  static  source  code  analyzer,  [Li87] 
classification  of,  [Gors80] 
comparison  of,  [Li87] 
major  properties,  [Bake80]

evaluation  and  validation  of,  [Elsh84],  [Kafu85a],  [Zoln76] 
code  and  structure  metrics,  [Cann85] 
factor  analysis  of  dimensionality  of  metrics,  [Muns89] 
for  assessing  maintainability, 

effectiveness  of  subjective/objective  measures,  [Gibs89] 
measurement  scales  for  characteristics,  [Harr82] 
framework  for,  [Basi80b] 
predictive  value,  [Davi88] 
for  Ada  see  also  Ada, 
problems  with,  [Kear85] 
relationships, 

among  measures,  [Basi83c],  [Lind89],  [Muns89] 
sensitivity  to  program  structuring,  [Evan83a],  [Evan84c] 
with  development  effort,  [Cann85],  [Lind89] 
with  error  characteristics,  [Basi83c],  [Cann85],  [Schn79a] 
specific  measures, 

Chapin’s  measure,  [Chap79] 

Harrison-Magel’s  nesting  level,  [Evan84a] 

Information  Flow  Complexity,  [Henr79],  [Henr81a],  [Kafu88] 
evaluation  of,  [Cann85] 

Invocation  Complexity,  [Kafu88],  [McCl78a]
evaluation  of,  [Cann85] 

Program  Analysis  Complexity  Model,  [McCl76]

Scope  Complexity  Ratio,  [Harr81b] 

Syntactic  Interconnection  Model,  [Wood81c] 
evaluation  of,  [Cann85] 

based  on  information  theory,  [Berl80],  [Chen78a],  [Shoo79] 
based  on  nesting  level,  [Harr81a],  [Piwo82] 
localization  of  variables,  [Rich76] 
control  flow  measures,  [Howa85] 
derived  metrics  slope  and  r  square,  [Basi83c] 
for  clarity  measurement,  [Gord79a],  [Gord79b] 
for  maintenance, 

figure-of-merit  for  modification  complexity,  [Yau78] 
knot  count  measure,  [Blai85a] 
for  productivity  prediction, 
model  of  programming  effort,  [Wood81a] 
primitive  level  instructions,  [Kitc81]
for  project  management, 
assessment  of,  [Suno82] 
step  count,  [Suno82] 
for  unstructuredness,  [Wood79a] 
hybrid  metric  with  context  sensitivity,  [Li87] 
internal  and  external  module  complexity,  [Lew88] 
knot  count  measures,  [Bake80] 
evaluation  of,  [Bake79a] 
program  bandwidth,  [Lind89] 
stability  measures,  [Yau78],  [Yau79] 
evaluation  of,  [Cann85] 

sum  of  the  difficulty  of  constituted  elements,  [Bem84] 
system  measures, 

quantifying  complexity  of  clustering  partitions,  [Bela81] 
theoretical  limitations,  [Leac87] 

Computational  Testing, 
as  a  debugging  aid,  [Clar83b] 

Computational  Work  Measurement,  [Hell72]

Concurrent  Systems, 
different  ways  of  using  parallelism, 
axiomatic  proof  rules  for,  [Hoar75] 
formal  verification  of, 
nonassertional  approach,  [Lamp79a] 
survey  of  verification  techniques,  [Barr85] 
models  of  concurrency,  Constrained  Expressions,  CSP 
abstract  conceptual  model,  [Kell76] 
geometric  models,  [Carr82],  [Cars84] 
notion  of  synchronization  structure, 
complexity  of  analysis,  [Tayl83b] 
parallel-program  model,  [Kell76] 

relation  of  parallel/nondeterministic,  [Flon78a],  [Flon81] 
semantics  of  concurrency,  nondeterminism,  communication,  [Fran79] 
specification  of, 

language  based  on  process  interactions, 
denotational  semantics,  [Kahn77] 
using  temporal  assertions,  [Lamp83] 
synchronization  structures, 
based  on  rendezvous, 
formalism  of,  [Tayl83b] 

testing  and  analysis,  Static  Concurrency  Analysis 
IN-SYM  test  for  synchronization  errors,  [Tai85c] 
algebraic  techniques,  [Avru83] 
anomaly  detection,  [Bris79] 
automated  support  for,  [Manc83] 
combining  static  and  dynamic  analysis,  [Tayl83c] 
data  flow  analysis,  [Tayl80b],  [Tayl80c] 
automated  support  for  HAL/S  programs,  [Tayl78b] 
error-based  testing,  [Long88] 

examples  from  RC  4000  multiprogramming  system,  [Brin73] 
of  specifications  and  design,  [Tai85a] 
structural  testing,  [Tayl86a]
support  for  static  analysis,  [Tayl83b] 
timing  analysis, 

based  on  path  expressions,  [Hsie89] 
theory  of  testing,  [Weis88a] 

Constrained  Expressions, 

DC  DYMOL  translator,  [Ho79] 
behavior  generator  for,  [Aver84] 
design  language,  [Dill86]

for  analysis  of  concurrent/distributed  systems,  [Dill85],  [Dill88c] 
automated  support  for,  see  ATTEST,  [Dill81] 
design  analysis,  [Dill84] 
automated  support  for,  [Avru86] 

Constraint  Logic  Programming, 
for  specification-based  testing,  [Gerh88b],  [Gorl87] 
LEONARDO  project,  [Gerh88a] 

Control  Flow  Analysis,  [Carr82],  [Wood77] 
algorithms  for,  [Hech77a] 

Cost  Estimation, 

approaches,  [Jame77],  [McGa84],  [Nels66] 
conversion  cost-estimation  techniques, 
review  and  analysis  of,  [Hout81] 
macromethodology  for,  [Putn78] 
size  estimation  based  on  data  structure  metrics, 
effort  estimation  based  on  metric  evolution,  [Wang84] 
structural  forecasting,  [Wolv74] 
comparison  of  techniques,  [Roac80] 
data  requirements  for,  [Wolv74] 
effort  estimation,  [Schn78],  [Wals77a],  [Zelk79] 
back-to-front  programming  prediction,  [Wang83] 
relationship  with  other  variables,  [Basi85e] 
using  1  programmer,  4  program  characteristics,  [Chry78] 
examples  of, 

deep  space  networking,  [Taus81] 
from  SEL  resource  forecasting,  [Basi78a] 
resource  utilization  curves, 

Parr  curve,  [Basi81f] 

Rayleigh  curve,  [Pica81] 
adjustments  for  maintenance  effort,  [Wien84] 
based  on  system  structure,  [Parr80]
separated  as  work  and  cost  units,  [Jone78] 
software  cost  estimation  study,  [Herd79] 
to  support  management,  [Putn77],  [Putn78],  [Putn79] 

Cost  Models,  [DACS79a]

COCOMO,  evaluation  and  tailoring  of,  [Miya85] 

Jensen  macrolevel  model,  [Jens83a] 
sensitivity  analysis  of,  [Jens83b] 

PRICE,  [Frei79] 

in  a  life  cycle  case  study,  [Kuhn82] 

SOFCOST:  Grumman’s  cost  estimating  model,  [Dirc81] 
WICOMOM:  Wang  Institute  cost  model,  [Dems82] 
avionics  software  support  cost  model,  [SYSC83]
classification  of, 
evaluation  of  classes,  [Ducl82] 
evaluation  of,  [Cook80],  [Thib81] 
extending  to  include  modularity  factors,  [Wood80a] 
for  fault  tolerance  strategies,  [Scot87] 
for  test  planning,  [Brow89],  [Goel81] 
meta-model  for  resource  expenditures,  [Bail80] 
program  size  estimation  model,  [Itak82] 
reflecting  complexity  of  personnel  coordination,  [Theb84] 
review  of,  [Ducl82] 
simulation  models, 

TSL:  Total  Software  Life-Cycle  Model,  [Ducl82] 
size,  complexity,  personnel  skill,  specification  volatility,  [Okad82] 
staffing  implications,  [Taus82]

Coupling, 
applications  of, 

measuring  the  design  process,  [Crui80] 
for  generation  of  hierarchical  system  descriptions, 
based  on  data  bindings,  [Selb88a],  [Selb88b] 

Coverage  Monitors, 

Program  Testing  Translator,  [Stuc72] 
for  Ada,  see  Ada,  [Pout87] 
principles  and  practices  for,  [Paig77a] 
self-metric  software  see  also  PET 
Cyclomatic  Complexity,  [McCa76]
adaptations  of,  [Bake79a],  [Hans78] 
applications  of, 

aid  to  testing,  [McCa82a],  [McCa82c],  [Perr88]
complexity  measurement,  [Elsh78c],  [Hans78] 
measure  of  program  structuredness,  [McCa76] 

productivity  prediction,  [Blai85a],  [Curt79a],  [Curt79b],  [Curt81],  [Wood81a] 

program  size  estimation,  [Gaff79] 

project  management,  [Suno82] 

support  for  regression  analysis,  [McCa82a] 

comparison  with  other  measures,  [Bake80],  [Gaff79],  [Harr81b],  [Kitc81],  [Wood79a],  [Wood81a]
evaluation  of,  [Bake79a],  [Basi81g],  [Evan84a],  [Harr81a] 
ability  to  provide  objective  measure  of  effort,  [Kitc81]
anomalies  and  extension  to  overcome,  [Myer77] 
measures  of  comprehensibility,  [Boys79] 
validation  of  across  FORTRAN  programs,  [Basi83b] 
for  Ada,  see  Ada,  [Taus87a] 
relationships, 

with  development  effort,  [Lind89] 
with  other  measures,  [Henr81a],  [Lind89] 


DACC:  Design  Assertion  Consistency  Checker, 
cost-effectiveness  of,  [Boeh75a] 

DACS, 

glossary  of  software  engineering  terms,  [DACS79b] 
quantitative  software  models,  [DACS79a] 
software  life  cycle  tools  directory,  [DACS85]

DAISTS:  Data  Abstraction,  Implementation,  Specification  and  Testing  System,  [Gann80],  [Gann81],  [Haml79],  [McMu82]

evaluation  of,  [McMu80] 

DARTS:  Design  Aids  for  Real-Time  Systems,  [CSDL80],  [Furt81] 

DAVE,  [Oste75a],  [Oste75b],  [Oste76a] 
experience  with,  [Fosd76a],  [Oste76b] 

DISSECT,  [Howd77b] 

advantages,  limitations,  and  uses  of,  [Howd76e] 

DREAM:  Design  Realization,  Evaluation  and  Modeling  System,  [Ridd78],  [Ridd79] 

DDN:  DREAM  Design  Notation,  [Ridd78] 

Data  Based  Program  Testing,  [Lask86] 

Data  Bases, 

development  data, 

BCS  software  production  data,  [Blac77] 
comparison  of  RADC  and  SEL,  [Turn81a] 
software  engineering,  [Romb87c] 

Data  Collection  and  Analysis,  see  also  SEL
applications  of, 

experimental  research,  [Basi84b] 
for  experimental  research,  [Zelk82] 
management,  [Basi84b] 
approaches, 

SARE:  Software  Acquisition  Resource  Expenditure,  [Duma83] 
data  requirements  for, 
cost  estimating,  [Wolv74] 

reliability  measurement,  [Litt80b],  [McCa87a],  [Thay75] 
examples  of,  [Bake77] 

ASTROS  measurement  program,  [John75] 
for  error  data,  [Fung85],  [Rube75],  [Thib78] 
goal-directed  data  collection, 
based  on  change  data,  [Basi81b] 
four  applications  of,  [Basi85f] 
to  evaluate  development  methodologies,  [Basi82c] 
methodologies  for  evaluating  failure  databases,  [Duva80] 
techniques,  [RADC76a] 
validation  and  analysis,  [Basi80c] 

Data  Flow  Analysis,  [Carr82],  [Fosd76b],  [Herm76] 

algorithms  for,  [Alle74],  [Alle76],  [Bart78],  [Fosd76a],  [Hech75],  [Hech77a],  [Jach84] 
applications  of,  [Oste81a] 
detection  of  some  unexecutable  paths,  [Oste77] 
required  element  testing,  [Ntaf81a],  [Ntaf82] 
support  for  automatic  program  slicing,  [Weis84] 
automated  support  for  see  also  DAVE,  ASSET,  FORTEST,  STAD 
block  testing,  [Lask82] 
criteria, 

Laski-Korel  criteria,  [Lask83] 

Rapps-Weyuker  criteria,  [Rapp80]
complexity  of,  [Weyu84a] 
comparison  of,  [Clar85a],  [Clar86a],  [Lask87]
error  detection  ability,  [Girg86a] 
selectivity  of  path  selection  criteria,  [Zeil88a]
feasible  criteria  for  nonexecutable  paths,  [Fran86],  [Fran88] 
d-tree  testing,  [Lask82] 
data  flow  classification,  [Fosd76a] 
for  concurrent  systems,  [Reif79c],  [Tayl80b],  [Tayl80c] 
for  recursive  PL1  programs,  [Rose75] 

formalism  for  specifying  diverse  range  of  sequences,  [Olen86] 
using  node  listings,  [Kenn75] 

Data  Space  Analysis,  [Paig81] 

Debugging,  [Brow73b],  [Rust71] 
and  understanding,  [Luke80] 
architectural  support  for, 
requirements  for,  [John82a] 
automated  support  for, 
applications  of, 

generation  of  program  traces/profiles,  [Satt75] 
desirable  features, 

concurrent  systems,  [Baia85],  [Webe83] 
distributed  systems,  [Garc84] 
real-time  systems, 

reconstructing  execution  host/target,  [Tayl82b] 
survey  of,  [Schw70a] 

techniques  for  improving  efficiency,  [Laue79],  [Satt75] 
automated  tools, 

AIDS:  Advanced  Interactive  Debugging  System,  [Hart79] 

ALADDIN  for  assembly  language,  [Fair79]

ECSP  Concurrent  Debugger,  [Baia85],  [DeFr85],  [Late84] 

EXDAMS,  [Balz69] 

FORTRAN  post  mortem  dump  system,  [Ng78] 

Incense  for  displaying  data  structures,  [Myer83] 

PEBUG:  Purdue  Extendable  Debugging  System,  [Blai71] 

for  Ada  see  also  Ada, 

for  Algol  W,  [Satt72],  [Satt75] 

for  dataflow  machines,  [Wahl88] 

for  distributed  systems,  [Schi81] 

TAP,  [Gord86],  [Gord88] 
based  on  EDL  see  also  EDL, 
debugging  commands,  [Stan80] 
for  real-time  systems, 

RED  and  implementation  schemes  for,  [Hill83] 
knowledge-based,  [Hara83],  [Shap81] 

LAURA,  [Adam80] 

PROUST,  [John84] 
symbolic  debugger,  [Brue83],  [John79] 

RAIDE  language  independent  system,  [John78] 

Symbolic  Debug/1000,  [HCP82] 
source  level  debugger  for  HP-1000,  [John83]
by  independent  persons,  [Musa76] 
empirical  stopping  rule,  [Form77] 
hierarchical  approach,  [Lask79] 

models  for  assessing  effects  of  process  imperfections,  [Down85a],  [Down86]
psychological  study  of,  [Goul72],  [Goul74],  [Goul75]
role  of,  [Schw70a] 
strategies,  [Laue79] 
supported  by, 

computational  and  domain  testing,  [Clar83b]
error-sensitive  testing,  [Fost83] 
knowledge-based  model  of  fault  localization,  [Sedl83] 
program  slicing,  [Weis84] 

Decision  Tables,  [Pooc74] 

checks  for  redundancy,  consistency,  completeness,  [Pooc74] 

Deductive  Reasoning,  [Dijk68] 

Dependability  Measurement,  see  also  Reliability  Measurement,  Availability  Measurement

Design  Analysis,  [Balz81],  [Gerr85]
automated  support  for,  DREAM 
DECA,  [Carp75]

SAMM  modeling  tool,  [Lamb78] 

TINKLER:  interleaving  testing/design,  [Lieb80] 
based  on  formal  specification,  [Gutt80] 
concurrent  systems, 

to  detect  synchronization  errors,  [Tai85a] 
inspections,  see  Code  Reading  and  Inspections,  [Faga76] 
testability  analysis, 
automated  support  for,  [Yin80] 
using  assertions  see  also  DACC, 
using  executable  specifications,  [Davi82b] 
using  finite  state  machines  see  also  Automata  Theoretic, 

Design  Evaluation,  see  also  Coupling,  Cohesion 
application  of  Software  Science,  [Szul81] 
application  of  software  science, 
for  Ada,  [Szul84] 

evolution  of  design  metrics  research,  [Romb88f] 
measures  of  complexity,  [Whit80] 
measures  of  quality,  [HenrXX],  [Troy81] 

Design  Indicators,  [Ross88] 

System  Entropy  Function,  [Moha79] 

System  Work  Function,  [Moha79] 

automated  support  for,  [Szul83],  [Yin78],  [Yin79],  [Yin80] 
metrics  for  embedded  real-time  designs,  [Szul80] 

Development  Environments, 

ARROWSMITH-P  expert  system  for  management,  [Basi85g] 

Cedar  Programming  Environment,  [Teit84]
Hughes  design  analysis  and  testability  system,  [Yin80] 

IPE:  Incremental  Programming  Environment,  [Medi81] 

Interlisp  programming  environment,  [Teit81] 

LEONARDO  project,  [Conk86] 

PDS  2:  Process  Design  System  2,  [Kopp76] 

SOFTING  Software  Engineering  Environment,  [Snee85] 

SPS:  Software  Productivity  System,  [Boeh84b] 

SSAGS:  Syntax  and  Semantics  Analysis  and  Generation  System,  [Payt82] 

SSES:  Software  Specification,  Evaluation  System,  [Hodg76] 

Toolpack,  [Oste83],  [Oste84]
design  principles,  [Tayl85],  [Tayl86b],  [Tayl87]
guidelines  for  incorporating  metrics,  [Selb87a] 
improvement-oriented,  [Basi88] 
integrated  tool  sets,  [Oste83]
sharing  intermediate  representations,  [Lamb83] 
persistent  typed  object  management, 

PGRAPHITE  model,  [Wile88] 
tool  fragment  approach,  [Zeil87] 
user  interfaces,  [Youn88b] 
evaluation  of, 
methodology  for,  [Weid86] 
workstations,  [Koer84] 
for  Ada,  Arcadia 

ARCTURUS,  [Stan83],  [Stan84a],  [Tayl85] 

Alsys  tool  set, 

Ada  program  VIEWer,  [Maug85] 
event-driven,  symbolic  debugger,  [Maug85] 

GRAPHITE:  a  meta-tool  for  development,  [Clar86c] 
based  on  wide-spectrum  languages,  [Luck86a] 
programming-in-the-large,  [Wolf85a]
for  concurrent/distributed  systems, 

MUST  flight  software  production  environment,  [Tayl78b] 
automated  support  under  UNIX,  DEMOS/MP,  [Mill84] 
reverse  engineering, 

ADDS:  Automated  Design  Description  System,  [Arth88] 
role  in  quality  assurance,  [Cher80a],  [Cher80b] 

Distributed  Systems, 

analysis  of  designs,  [Avru85],  Constrained  Expressions 
based  on  modified  Petri  nets,  [Cagl82] 
debugging,  [Schi81] 
measurement  of, 
guidelines  and  standards  for, 
quality  measurement,  [Bowe83] 
model  for  distributed  computations, 

FA/C  Functionally  Accurate/Cooperative,  [Less81] 
computation-communication  model, 
automated  support  under  UNIX,  DEMOS/MP,  [Mill84] 
to  support  distributed  termination,  [Fran80] 
trace  analysis,  [Jard87] 

DoD  Guidelines  and  Standards,  [DeMi87a] 

Defense  System  Software  Quality  Program,  [DODS86] 
process-product  relationships  with  STD-2167,  [Lave88] 

Defense  Systems  Software  Development,  [DOD88] 

Technical  Review/Audits  for  Systems,  Equipment,  Computer  Programs,  [MIL85]
independent  verification  and  validation,  [AFSC88a] 
management  indicators,  [AFSC86a] 
quality, 

quality  assurance,  [Army84],  [McWe84] 
specification/measurement,  [AFSC86b],  [McCa77a]
survey  of  military  standards,  [Bowe79] 
risk  abatement,  [AFSC88b]
test  and  evaluation,

Software  Test  and  Evaluation  Manual,  [DODD87] 

Test  and  Evaluation  Master  Plan  guidelines,  [DODD86b] 

Test  and  Evaluation,  [DODD86a] 
guidelines  for,  [Army87] 
operational  testing, 
maintainability,  [AFOT87] 
management  guidelines,  [AFOT86] 
supportability,  [AFOT88a],  [AFOT88b] 
usability,  [AFOT82] 

Domain  Testing,  [Whit78b] 
as  a  debugging  aid,  [Clar83b] 
error  analysis  of,  [Whit78a],  [Zeil89] 
loop  analysis  problems,  [Whit88a],  [Wisz87] 
test  data  selection  strategies  and  error  bounds, 

Clarke-Richardson,  complexity  of,  [Hass80] 

White-Cohen,  [Cohe78],  [Pere85],  [Whit86] 
complexity  of  testing  iterated  borders,  [Whit88a] 
evaluation  and  complexity  of,  [Hass80] 

EDL:  Event  Definition  Language,  [Bate81] 

BA:  behavioral  abstraction  approach,  [Bate83a] 
a  basis  for  distributed  system  debugging  tools,  [Bate82] 
automated  support  for,  [Bate83b] 

EQUATE,  [Zeil86] 
complexity  of,  [Zeil88b] 

ESTCA:  Error  Sensitive  Test  Case  Analysis,  [Fost80],  [Fost85] 
application  to  debugging,  [Fost83] 
sensitive  test  data  for  logical  expressions,  [Fost84] 

Economics,  [Boeh81] 
of  fault-tolerance,  [Mign82] 
of  modularization,  [Camp76] 
of  quality  assurance,  [Albe76] 

programming  cost  factors,  [Boeh73],  [Boeh75b],  [Farr65],  [Putn82] 

Encryption  Protocols, 
formal  verification, 
examples  of, 

RSA  Public  Key  Encryption  Algorithm,  [Boye84b] 
testing  and  analysis, 

Inatest,  [Kemm87] 

Environment  Characteristics, 
characteristic  set, 

customizing  to  an  environment,  [Basi85a] 
forecasting  productivity,  [Basi85a] 

Equivalence  Partitioning,  [Selb86] 
automated  support  for, 

AutoParts,  [Soli85] 

Error-Based  Testing,  [Clar83a],  [Ostr79],  [Weyu81] 
extension  to  real-time,  concurrent  systems,  [Long88] 
formalism  for, 

characterizing  completeness  of  tests,  [Howd82a]
contrasting  with  other  approaches,  [Howd82a]
theory  of,  [More84] 

Error  Seeding, 

capture-recapture  sampling,  [Dura81b] 
evaluation  of  seeding  methods,  [Knig85a] 
failure  characteristics  of  syntactic  changes,  [Knig85a] 
issues  involved,  [Knig85a] 

Errors, 

classification  of,  [Beiz83],  [Mend79],  [Ostr84],  [RADC76a]
error  causes,  [Boeh75a] 

error  complexity  (measure  of  detectability),  [Naka89] 
errors  occurring  in  real-time  systems,  [Ande83]
errors  occurring  in  system  programs  methods,  [Endr75]
for  all  development  phases,  [Bowe80] 
method  for,  [Amor75] 
problems  in,  [Jeli72] 

review  of  classification  schemes,  [Bowe80] 
taking  the  programmer  into  account,  [JohnXX] 
data  collection  and  analysis  needs,  [Fung85],  [Rube75] 
examples  of,  [Garm81] 

errors  from  DOS/VS  operating  system,  [Endr75] 
errors,  error-proneness,  diagnosis  in  COBOL,  [Lite76] 
from  IV&V  projects,  [Fuji77] 
from  special-purpose  editor  system,  [Ostr84] 
experiments  in, 

error  occurrence  and  detection,  [Hoff77]
in  distributed  systems, 
ordering  errors,  [Gord85b] 
influencing  factors,  [Feue79a],  [Gerh76a] 
comments,  [Howd88] 
complexity,  [Basi82d] 
language  factors,  [Nage84] 

program  structure  and  complexity,  [Brad75],  [Gree76] 
programmer  and  problem,  [Nage82],  [Nage84] 
reasoning  errors  made  in  software  construction,  [Howd89a] 
statistics, 

across  environments,  [Weis82] 

algorithm  implementation  errors,  [Bulu74] 

design  errors,  [Boeh75a] 

error  types,  frequencies  and  habitats,  [Schw70a] 

errors  detected  in  development/IV&V,  [Rube75] 

from  a  testing  service,  [Mill79b] 

persistent  errors,  [Glas81] 

recurrent  errors  in  real-time  systems,  [Goel78] 

syntactical  errors,  [Boie72] 

system  program  errors,  [Endr75] 

types,  distribution,  test/correction  times,  [Shoo75] 

Evaluation  Approaches, 
for  evaluation  of, 
computer  models,  [Prat80] 
development  practices,  [Basi81c],  [McGa82]
Reduced  Form  for  sharing  complexity  data,  [Harr85]

analysis  of  change  data,  [Basi81b],  [Basi81e],  [Basi82a],  [Weis81],  [Weis85c]
automatic  generation  of  artificial  systems,  [Rowl88] 
cluster  analysis,  [Chen81] 
coupling  evaluation  with  measurement,  [Selb85] 
effectiveness  of  elimination  of  faults,  [Curr76] 
error  analysis,  [Glas80],  [Howd80b],  [Weis78],  [Weis82] 
game  theoretic  testing,  [Cher88] 
goal-based  paradigm  for,  [Basi82c],  [Basi85c] 
mutation  analysis  for  test  data  adequacy,  [Ntaf81a] 
other,  [Panz81b] 
procedural  approach,  [Henr85] 
statistical  model  for  evaluating  effectiveness,  [Card87b] 
environments,  [Weid86] 
error  relationships, 
analysis  of  change  data,  [Basi82d] 
experimental  work  in  software  engineering,  [Basi86a] 
human  understanding,  cloze  tests,  [Hall86] 
metrics  (example  from  STARS  program),  [Gord85a] 
reliability  models, 

replicated  experiments,  [Nage82],  [Nage84] 
stepwise  statistical  methodology,  [Troy86] 
software  prototypes,  [Chur86] 
test  data  selection  criteria, 

RELAY  model  of  error  detection,  [Rich86a] 

Examples  of, 
compiler  validation, 

ACVC:  Ada  Compiler  Validation  Capability,  [Will89] 
measurement, 

metrics  applied  to  relational  data  base,  [Redd84a],  [Redd84b] 
safe/reliable  computing  on  Airbus/ATR  Aircraft,  [Roqu86]
testing  a  multiprogramming  system,  [Hans73] 
testing  and  analysis  approaches,  [Mill75e],  [Muno88] 

PEI  Testing  Methodology,  [Post87] 
for  nuclear  reactor  protection  systems,  [Geig79] 
case  studies,  [Uren87] 
testing  and  validation,  [Ho78] 
testing  of  the  TRIDENT  CC  system,  [Oxma78] 

Exhaustive  Testing,  [Brow72a],  [Shoo74] 

Experimental  Design,  [Coch50] 
beat  the  system,  [Budd80a] 

behavioral  or  psychological  approaches,  [Broo80a] 
designing  a  measurement  experiment,  [Basi77b] 
reproducible  experiments,  [Come79] 
sampling  theory  and  applications,  [Coch53] 

Extremal-Special  Value  (ESV)  Testing, 
for  fault-tolerant  systems,  [Vouk86b] 

Extremal-Special  Values  (ESV)  Testing, 
for  fault-tolerant  systems,  [Vouk86a] 

FAST:  Fortran  Analysis  System,  [Brow78]

FDM:  Formal  Development  Methodology,  [Kemm80]

Ina  Jo,  [Kemm80],  [Sche85] 
abstract  machine  model  of  a  specification,  [Berr87] 
language  reference  manual,  [Loca80] 
testing  of  specifications, 

Inatest,  [Eckm84],  [Eckm85],  [Kemm85a] 
example  of  analyzing  encryption  protocols,  [Kemm87] 
with  temporal  logic  for  concurrency  properties,  [Wing89] 
comparison  with  other  techniques,  [Cheh81] 
status  and  future  directions,  [Kemm81],  [Kemm86] 
theory  of,  [Berr87] 

FORTEST,  [Girg85] 

experiments  in  error  detection  ability,  [Girg86a] 

FORTRAN, 

automated  support  for,  see  also  FAST,  Mothra,  FORTEST,  DISSECT,  DAVE,  PET 
Automatic  Code  Evaluation  System,  [Hall73],  [Hall74],  [Rama73],  [Rama74a] 

BRANANL  for  identifying  basic  blocks,  [Fosd74] 

FAVS,  [GRC79] 

RXVP  verification  system,  [Mill74d],  [Mill75f] 

FETE:  execution  time  estimator,  [Igna71] 

FORTRANL  static  source  code  analyzer,  [Li87] 

FORTVER  documentation  and  error  diagnosis,  [Conr85] 

NBS  test  programs,  [NBS74] 

SAP:  Static  Source  Code  Analyzer  Program,  [Deck82a] 

SELFMET  for  self-metric  instrumentation,  [Urba73] 
application  of  Software  Science,  [Otte76]
post  mortem  dump  system,  [Ng78]
static  analyzer,  [Slav75] 

symbolic  execution,  [Clar76b],  [Fava79],  [Rama76] 

SADAT,  [Voge80] 
test  drivers, 

test  procedure  language/processor,  [GE77a],  [GE77b],  [Panz76],  [Panz78a],  [Panz78b],  [Panz78c] 
use  of  software  probes,  [Page74] 
how  it’s  used  and  needed  compiler  support,  [Knut71] 
impact  on  reliability,  [Goel88] 

Failure  Mode  and  Effects  Analysis  (FMEA),  [Bunc80],  [Laws83],  [Reif79a] 
example  for  satellite,  launch  vehicle,  reentry  systems,  [SAMS77] 

Fault-Based  Testing, 

RELAY  model  of  error  detection,  [Rich86b] 
applications  of,  [Rich88]

for  analysis  of  test  data  selection  criteria,  [Rich86a] 
for  testing,  [Rich87a] 
classification  of  techniques,  [More88] 
symbolic  testing,  [More87],  [More88] 
theory  of,  [More87],  [More88]

Fault-Tolerance,  [Aviz78] 
and  fault-intolerance,  [Aviz75] 
as  a  basis  for  system  structuring,  [Rand75] 
automated  support  for,  [Wild87] 
evaluation  of  technology,  [Ande85],  [Sliv84] 
examples  of,
air  traffic  control  system,  [Aviz87]
experiments  in,  [Dunh85] 
with  the  SIFT  operating  system,  [Brun85] 
for  dataflow  machines,  [Srin85] 
principles  and  practices,  [Ande81] 
queuing  analysis  of,  [Nico87] 
real-time  systems,  [Ande83] 

relationship  of  fault  tolerance/elimination  techniques,  [Shim88] 
reliability  evaluation, 

based  on  directed  acyclic  graphs,  [Sahn87] 
strategies,  see  also  Recovery  Blocks,  N-Version  Software 
comparison  of,  [Grna80a],  [Scot84b],  [Scot87]
correspondent  computing, 
implementation  for  Ada,  [Lee89a] 
cost  model  for,  [Mign82],  [Scot87] 
detector  redundant  scheme,  [Han76] 
modeling  of,  [Grna80b]
redundant  data  structures,  [Blac81],  [Tayl80a] 
repetitive  run  modeling  for  failure/fault  estimation,  [Dunh86] 

Fault-Tree  Analysis,  [Harv82],  [Leve83b],  [McIn83],  [Vese81]
automated  support  for,  [Rola86],  [Stol84] 
for  Ada,  [Leve83c] 

for  both  hardware  and  software,  [Han76] 

Finite  State  Machines, 
applications  of,  Communication  Protocols
interpretation  correctness,  [Ferr77] 
model  environment  for  validation,  [Hend75] 
requirements  modeling  for  testability,  [Chan85] 
specifying/verifying  data  abstractions,  [Shan82] 
testing  correctness  of  control  structures,  [Chow78] 
estimates  of  software  size  from,  [Brit82] 
state  machine  specification  technique,  [Prin78] 

Flavor  Analysis,  [Howd87],  [Howd89a] 

Flow  Expressions, 

for  specification  of  concurrent  systems,  [Shaw78] 

Formal  Verification  (Hardware),

examples  of, 

FM8501  verified  microprocessor,  [Hunt85],  [Hunt87] 
reusable  library,  [Bevi88] 
test  vector  generation,  [Vose88] 

Formal  Verification,  see  also  Invariant  Assertion,  Intermittent  Assertion,  Induction 
automated  support  for, 

modification  of  first-order  rules  for  algebraic  expressions,  [Sark89]
practical  problems,  [Boye84a],  [Luck77] 
state  of  the  art,  [Crai86],  [Crai87a] 
survey  of, 

mechanical  support  for  formal  reasoning,  [Lind88d] 
theorem  provers,  [Elsp72a] 

automated  tools  see  also  m-EVES,  AFFIRM,  Gypsy,  HDM,  Boyer-Moore  Computational  Logic,  FDM,  Stanford  Pascal  Verifier
interactive  program  verification  system,  [Deut73]
logical  basis  and  implementation,  [Igar73]
program  verifier,  [King69],  [King70] 
concurrent/distributed  systems  see  also  Temporal  Logic, 

EBS:  Event-Based  Specification  Language, 
comparison  of, 

EBS  with  temporal  logic  and  trace  approaches,  [Chen83] 

Misra-Chandy’s  proof  method,  [Misr81] 
nonassertional  approach,  [Lamp79a] 
problems  in,  [Grie77] 

proving  total  correctness,  [Flon78b],  [Flon81],  [Misr82] 
proving  weak  correctness,  [Flon78a] 
examples  of, 

Byzantine  Generals  problem,  [Lamp82] 

Dijkstra’s  garbage  collector,  [Grie77] 
for  a  reactor  protection  system,  [Ehre73] 
proof  of  a  calendar  program,  [Lamp79b] 
proof  of  a  program:  FIND,  [Hoar71b] 
proof  of  computer  interval  arithmetic,  [Good70] 
for  Ada,  see  also  Ada 
impact  of  language  design,  [Wulf76]
interpretation  correctness,  [Ferr77] 
introduction  to,  [Berg82],  [Grie76],  [Lond75] 
methods, 

classification  of,  [Mili84] 

constructive  approach,  [Good75e],  [Hoar72],  [Wegb77] 
in  support  of  transformational  programming,  [Krie86] 
heuristic  approach,  [Katz73] 
stacking  approaches,  [Hunt87],  [Moor88] 
survey  of  theory  and  techniques,  [Elsp72a] 
of  compilers, 

for  micro  Gypsy,  [Moor88] 
of  structured  programs,  [Ling79] 
principles  of,  [Good79b] 

proofs,  completeness,  transcendentals  and  sampling,  [Davi77] 
state  of  the  art,  [Kemm86],  [Oste80] 
prospects  for,  [Dahl78],  [DeMi79a],  [Fetz88] 
supported  by, 

abstract  data  types,  [Flon77] 
automated  support  for,  [Gutt78a] 
control  structure  abstraction,  [Gerh76b] 
program  traces,  [Howd78c] 

state  machines  for  interpretation,  program,  implementation  correctness,  [Ferr77] 
symbolic  execution,  [Burs74],  [Dill87]
for  communication  protocols,  [Bran78] 
limitations,  (dis)advantages  of  methods,  [Dill88a] 

Function  Point  Analysis, 
applications  of,  [Symo88] 
productivity  measurement,  [Albr79],  [Albr81] 
with  a  productivity  index,  [Behr83] 
comparison  with  other  measures,  [Albr83] 
estimating  handbook,  [Zwan84]
partial  alternative  method,  [Symo88]

review  of  metric  derivation/calibration,  [Vern89]

Functional  Analysis,  [Howd87] 
data  type  transformation  analysis,  [Howd86] 
functional  trace  analysis,  [Howd86] 
operator  sequence  analysis,  [Howd86] 

Functional  Testing,  [Elme71],  [Howd81a],  [Howd86],  [Howd87] 
applications  of, 

module  and  integration  testing,  [Howd85] 
security  testing,  [Glig87] 
based  on, 

Basic  User  Perceived  functions,  [HennXX] 
algebraic  data  type  specifications,  [Boug86],  [Choq86] 

Prolog  interpreter,  [Boug86],  [Choq86] 
design  abstractions,  [Howd80a] 
formal  specification,  [Lask88a]
backtracking  issues,  [Mill75c] 
category-partition  method, 
automated  support  for,  [Ostr86],  [Ostr88] 

comparison  with  other  techniques,  [Basi85b],  [Howd80c],  [Hwan81],  [Selb86] 
reliability  of,  [Howd80c] 

relationship  of  test  data  to  operational  usage,  [Basi84a] 
test  control  process  for,  [Elme69] 

General, 

analysis  of  validation  techniques  for  scientific  programs,  [Howd79],  [Howd80b] 
basic  text  on  testing,  [Myer79] 

effectiveness  of  static,  dynamic  techniques,  [Howd80b] 
formal  methods, 
prospects  for,  [Levi78] 
formal  program  testing,  [Cart81] 
issues  of  liability,  [Joyc87a] 

number  of  tests  necessary  to  verify  a  program,  [Shoo79] 
problems  in  large-scale  system  development,  [Broo75] 
program  test  methods,  [Hetz73] 
software  validation,  [Carr80] 

testing  for  an  individual  programmer  with  limited  resources,  [Bran80] 
why  does  software  die,  [Brow80a] 

Glossary  for, 
debugging,  [John82b] 

software  engineering,  [Babs83],  [DACS79b],  [IEEE83a] 
software  tools  and  techniques,  [Reif79b] 

Grammars, 

attribute  grammars, 

for  test  data  generation,  [Dunc78],  [Dunc81] 
relating  logic  programs  with,  [Dera85]
context-free  grammars, 
for  test  data  generation, 

Mockingbird,  [Gori87] 

for  compiler  testing,  [Bazz82] 

for  testing  parsers/debugging  grammars,  [Purd72]
for  test  plan  generation,  [Baue79a]
formal  grammars, 
for  compiler  testing, 
example  of  verification,  [Hous77] 

Graph  Theory,  [Cant89],  [Stig74] 

Dilworth’s  theorem  for  acyclic  digraphs,  [Ntaf79],  [Ntaf81b] 
algorithms  for, 

available  expressions  at  entrance,  [Ullm73] 
path  building,  complexity  of,  [Gabo76] 
applications  of, 
control  flow  analysis, 
graph-theoretic  constructs  for,  [Alle71] 
data  flow  analysis,  [Hech75] 
design  simulation,  [Schn77b] 

partitioning  to  highlight  element  relationships,  [Paig75] 
performance  and  reliability  analysis,  [Sahn87] 
predicting  execution  behavior,  [Olde83]
program  design  and  debugging,  [Schn79b] 

testing,  [Beiz83],  [Fosd76a],  [Jach84],  [Paig72],  [Paig78a],  [Stic78] 
path  cover  problems,  [Ntaf81b] 
reducible  flow  graphs,  [Hech72] 
review  of  partitioning  methods,  [Paig77b] 

Gypsy  Verification  Environment,  [Good75c],  [Good84b] 

Gypsy  language,  [Ambl76a],  [Ambl76b] 
verified  compiler  for  micro  Gypsy,  [Youn86c] 

comparison  with  other  techniques,  [Cheh81],  [Crai88a],  [Kauf87a],  [Kauf87b] 
examples  of, 

message  flow  modulator,  [Good82b] 
proof  of  a  distributed  system,  [Good82a] 
verification  of  communication  protocols,  [DiVi82] 
verification  of  security  kernels,  [EPI82] 
status  and  future  directions,  [Good86a],  [Kemm86] 
symbolic  execution  of  concurrent  systems,  [Eckm83a] 

HDM:  Hierarchical  Development  Methodology,  [Elsp72b],  [Elsp73],  [Elsp74],  [Robi79],  [Silv79] 
EHDM,  [Crow85a] 
specification  language,  [Crow85b] 

Muse  to  enhance  HDM  for  A1  certification,  [Halp87] 

SPECIAL:  SPECIfication  and  Assertion  Language,  [Robi77],  [Roub77],  [Rush84] 
comparison  with  other  techniques,  [Cheh81],  [Gogu80],  [Mill81b] 
examples  of, 

verification  of  the  Provably  Secure  Operating  System,  [Neum75] 
verification  of  the  SIFT  operating  system,  [Gold80],  [Mell82],  [Stan84b] 
status  and  future  directions,  [Kemm86] 

Hoare’s  Logic, 

generalized  to  concurrent  programs, 
relation  to  Pnueli’s  temporal  logic  formalism,  [Lamp80] 
the  Decomposition  Principle  meta-rule,  [Lamp84] 
survey  of  results  of  application,  [Apt81] 

Human  Factors, 

behavioral  analysis  of  programming,
frequency  of  syntactical  errors,  [Boie72]
cognitive  psychology, 
and  Software  Science,  [Coul83] 
cognitive  science  of  programming,  [Curt83] 
display  techniques  to  facilitate  comprehension,  [Vemu80]
problem  solving  capabilities/performance,  [Grif72],  [Love77a] 
theory  of  the  learnable,  [Vali84]
types  of  programming  knowledge,  [Solo84] 
experiments  in,  [Basi79b] 
effects  of  modern  coding  practices,  [Shep79]
for  developing  quality  software,  [Shne77c] 
impact  of  degree  of  discipline,  [Basi78b] 
impact  of  flowcharts,  [Shne77a] 

impact  of  specification  symbology/spatial  arrangement,  [Shep81] 
influences  on  understanding,  [Shep77] 
methods  for,  cloze  tests,  [Hall86] 
program  comprehension,  [Boys79] 

psychological  study  of  debugging,  [Goul72],  [Goul74],  [Goul75] 
team  design,  [Basi78b],  [Reit79] 
factors  in  team  programming,  [Theb83] 
fault  tolerance, 

influence  of  programmer  profiles  on  coincident  errors,  [Vouk85b] 
human  errors  in  programming,  [Youn74] 
mental  effort  related  to  program  clarity,  a  measure  of,  [Gord77] 
program  structure, 

impact  on  program  understanding,  [Love77b],  [Miar83] 
psychological  complexity, 

of  maintenance  tasks,  [Curt79a],  [Curt79b],  [Curt81] 
relationship  with  software  complexity,  [Evan83b] 
psychology  of  programming,  [Shne80],  [Wein71] 
review  of  research,  [Shei81] 

IEEE  Guidelines  and  Standards, 

configuration  management,  [IEEE83c] 
measures  to  produce  reliable  software,  [IEEE87] 
quality  assurance  plans,  [IEEE84] 
software  engineering  terms,  [IEEE83a] 
software  quality  metrics  methodology,  [IEEE88] 
test  documentation,  [IEEE83b] 

IOGen,  [Jenk86],  [Lind85],  [Lind88a],  [Lind88b],  [Lind88c] 
TESTgen,  [Coff87] 
typing  mechanism  for  CAIS,  [Lind87] 

ISO-OSI, 

conformance  testing  methodology/framework,
abstract  test  suite  specification,  [ISO87b]
general  concepts,  [ISO87a]

Incremental  Analysis, 

automated  support  for  see  also  PIC,  AdaPIC  Toolset 
for  logic  programming, 

GCLP:  Generic  Constraint  Logic  Programming,  [Wild88] 
sources  of  incompleteness,  [Wild88]

Independent  Verification  and  Validation,  [JLC84]

DoD  guidelines  and  standards,  [AFSC88a] 

evaluation  of  methodology  for  flight  dynamics,  [Page85] 

for  certification  of  minimum  testing  criteria,  [Sork79] 

planning  and  conduct,  [Fuji77] 

practical  experience  with,  [Page84] 

role  of  independent  validation  agency,  [Agil76]

Induction,  [Ande79a] 
computational  induction, 
subgoal  induction,  [Morr77] 
generator  induction  for  data  structures,  [Wegb76] 
inductive  assertion,  [Gall81],  [Hant76],  [Kell76],  [Lamp77],  [Lond75] 
automated  support  for  see  also  HDM,  Gypsy  Verification  Environment 
inductive  inference,  [Angl80],  [Angl83],  [Blum75] 
computational  complexity  of,  [Angl76] 
to  support  investigation  of  program  testing,  [Cher86] 
proofs  of  equational  theories  with  constructors,  [Huet80] 

Information  Flow  Analysis,  [Carr82] 

Instrumentation,  [Prob80] 

SELFMET  for  self-metric  FORTRAN  instrumentation,  [Urba73] 
applications  of,  Coverage  Monitors,  Run-Time  Monitoring
collecting  program  attribute  values,  [Huan78] 
detection  of  data  flow  anomalies,  [Huan79] 
profile  keeping,  [Knut71] 

optimal  measurements  for  frequency  counts,  [Knut73] 
distributed  environments, 

METRIC:  a  kernel  instrumentation  system,  [McDa77] 
software  probes, 
for  testing  FORTRAN,  [Page74] 
optimal  placement  of  monitors,  [Rama75b] 
statement  contrasted  with  branch  probes,  [Prob82c] 

Integrated  Application  of  Techniques, 
automated  support  for  see  also  Toolpack,  TEAM 
benefits  of, 

achieved  in  the  PIMS  Trending  Project,  [Post87] 
experiments  in,  [Selb86] 

formal  verification,  testing,  analysis,  [Oste80],  Partition  Analysis 
investigative  approaches, 
state-space  exploration,  [Youn89c] 
measurement,  [Kafu81] 
and  documentation,  [Schr84] 
proofs,  analytic  models,  testing,  [Triv80] 
testing  and  analysis,  [Clar82],  [Howd82b],  [Oste81b],  [Oste84]
combining  static  and  dynamic  analysis,  [Tayl83c] 
with  symbolic  execution  and  formal  verification,  [Oste80] 
fault-based  techniques,  [Youn88a] 
functional  and  structural  testing,  [Clar78a] 
functional  testing,  [Howd85],  [Howd87] 
mutation  and  perturbation  testing  see  also  EQUATE, 
test  generation  with  design,  [Lask88a]

Integration  Testing,
auditing  of  SOFTING,  [Snee85]
genesis  of  discrepancies,  [Jeli72] 
military  standards  and  metrics,  [Bowe79] 
white  box  approach,  [Hale82] 

Interface  Analysis, 

automated  support  for  see  also  IOGen,  SADMT,  PIC,  AdaPIC  Toolset 
comparison  with  other  techniques,  [Howd77c] 
for  program  structuring,  [Trio86] 
interface  control, 

formal  model  for,  [Wolf85b],  [Wolf86a] 
modeling  stabilization  of  a  large  system,  [Hane72] 

Interface  Specifications, 
input-output  specifications,  [Haml77a] 
organization  for  specifying  abstract  interfaces,  [Clem84] 

Intermittent  Assertion, 

correctness  of  continually  operating  programs,  [Mann78] 
total  correctness,  [Grie79] 
translating  other  proofs  into,  [Grie79] 
validity  of  program  transformations,  [Mann78]

LISP, 

Metric  for  analysis  of  program  performance,  [Wegb75] 
formal  verification, 

Turing  completeness  of  pure  Lisp,  [Boye83] 
test  data  for  proving  LISP  programs,  [Budd78c] 
testing  and  analysis, 
mutation  analysis,  [Budd80a] 

Language  Design, 

approaches  for  improved  testing/analysis,  [Kosy73] 
experiments  in, 

design  principles  to  promote  reliability,  [Gann75]
effects  of  high-  and  low-level  languages,  [Bish86] 
graphical  vs  textual  design  languages,  [HenrXX] 
impact  of  static  typing  and  typeless,  [Gann76],  [Gann77] 
language  features,  stylistic/design  techniques,  [Shne75] 
nonprocedural  languages  and  productivity,  [Hare82] 
requirements  for, 
exception  handling,  [Good75d] 
formal  verification,  [DeMi79a],  [Wulf76]
module  interconnection  languages,  [DeRe76] 
powerful  checking  by  compilers,  [Will79] 

Language  Specification, 

denotational  semantics:  Scott-Strachey  approach,  [Stoy77] 
dynamic  grammar,  [Hanf70] 
semantics  of  programming  languages, 
expressed  in  SEMANOL(73),  [Ande76b] 
using  Petri  nets  (for  Ada  tasking),  [Mand85] 

Lines  of  Code  (LOC), 
applications  of, 

predicting  productivity,  [Curt79a],  [Curt79b],  [Curt81],  [Hals77d],  [Wood81a]
comparison  with  other  measures,  [Albr83],  [Wood81a]

Loop  Analysis,

automated  support  for,  [Wate79] 
by  solving  first  order  recurrence  relations,  [Chea78] 
determining  provability/unprovability,  [Meye67] 
heuristic/extraction  for  predicate  synthesis,  [Wegb74] 

MAP,  [Warr82] 

using  static  analysis  to  support  debugging,  [Tisc83] 

Machine  Architectures, 
dataflow  machines, 

distributed  debugging  methodology/simulator,  [Wahl88] 
simulated  program  execution,  [Land79] 
software  development  tools  for,  [Jarr84] 
support  for  fault  tolerance,  dataflow  graphs,  [Srin85] 
vector  processors, 

for  mutation  analysis,  [Gali87a],  [Gali87b],  [Gree87],  [Krau86],  [Ligo87],  [Math86] 
mutant  unification,  [Math88a],  [Math88b],  [RegoXX]
transformation  techniques,  [Math87a],  [Math87b]
unified  scheduling  of  mutants,  [Krau88] 

Maintainability, 

definitions  of,  [Gelp79],  [Gilb79] 
experiments  in, 

effectiveness  of  subjective/objective  measures,  [Gibs89] 
relationship  with  system  structure,  [Gibs89] 

influencing  factors,  [Grad87a],  [Grem84],  [Lohs84],  [Romb87a],  [Romb87b],  [Shep78],  [Vess83],  [vanH68] 
measurement  of,  [Feue79a],  [Romb89a] 
a  case  study,  [Blai85b] 

figure-of-merit  for  modification  complexity,  [Yau78] 
stability  and  modifiability,  [Romb87a],  [Romb87b] 
characteristic  metric  set,  [Romb84] 
stability  measure,  [Yau79] 

using  complexity  metrics,  [Bern84],  [Harr82],  [Wake88]
using  quality  metrics,  [Henr88a],  [Kafu85b] 
via  questionnaires,  [AFOT87]
ripple  analysis,  [Hane72],  [Hsie82],  [Yau78] 
testing  of,  [Gelp79] 

Mathematical  Foundations, 
boolean  algebra,  [Beiz83] 

fallibility  in  mathematics  and  programming,  [Gerh76a] 
for  structured  programming,  [Mill72b] 
formal  notations  for  design,  [Hoar87] 
mathematical  theory  of  computation,  [Mann74] 
predicate  calculus, 

notions  of  extension  and  equivalence,  [Gall81] 
regression  analysis, 

analysis  of  variance  and  regression,  [Dunn74] 
multiple  linear  regression,  [Drap66] 
sampling  theory,  [Haml87] 
problems  in,  [Haml86] 
statistical  theory, 

in  information  content  and  complexity,  [Shoo77b]
stochastic  processes,  [Cinl75]

Measurement  and  Evaluation  Systems, 

AMS:  Automated  Measurement  System,  [Sief8S] 

Mentor  for  measurement/documentation,  [Schr84] 

SMDC:  Software  Metrics  Data  Collection  System,  [Yu88a] 
for  Ada,  see  also  TAME,  ADAMAT 
Metrics,  [Gilb76]  see  also  Quality  Measures 
applications  of, 

identifying  error-prone  software, 
decision  tree  framework,  [Selb87c] 
review  of  process,  product  measures,  [Shen85] 
performance  evaluation,  [Lync81] 
reliability  measurement  of  military  systems,  [Koss88] 
size  and  effort  estimation,  [Wang84] 
software  development  management,  [Gaff81a] 
support  for  allocation  of  resources,  [Shen85] 
team  design,  [Theb83] 

candidate  top  10  list  of  metric  relationships,  [Boeh87] 
characteristics  set  of  cost/quality  metrics,  [Selb85] 
classification  of,  [Basi86c] 
critical  issues,  [Ejio87] 
frameworks  for, 

decision  trees,  [Selb87c],  [Selb89] 
introduction  and  overview,  [Cook82],  [Duns83] 
metrics  and  models,  [Cont86] 
selection  of, 

supported  by  measures  of  yield  and  coverage,  [Kafu85a] 
units  of  measure,  [Jone78] 
alternative  to  lines  of  code, 
based  on  Deviation-values  (D-values),  [Miya87] 
validation  of,  [Kafu88],  [Perk86] 
difficulties  in  evolving  and  validating,  [Gaff81a] 
framework  for  evaluation  of, 
example  from  the  DoD  STARS  program,  [Gord85a] 

Modal  Logic, 

relation  of  Manna’s  and  Floyd’s  techniques,  [Burs74] 

Mothra,  [DeMi86b],  [DeMi87d],  [DeMi88a] 
design  principles,  [DeMi87b] 
functional  capabilities,  [DeMi86a] 
interpreter  requirements,  [Offu87] 
testing  of  Mothra,  [BowsXX] 
thematic  tools  for  testing,  [DeMi87b] 
user  manual,  [Guin87],  [SERC87] 

Multiple  Domain  Test  Coverage,  [Redw83] 

Mutation  Analysis,  [Acre79],  [Budd78a],  [DeMi79b],  [DeMi87b] 
a  measure  of  test  data  adequacy, 
used  in  comparison  of  testing  techniques,  [Ntaf81a] 
applied  to, 

Ada  see  also  Ada, 

LISP,  [Budd80a]

decision  table  programs,  [Budd78b],  [Budd80a]
numerical  software,  [Henn81]
automated  support  for,  Mothra 
Ada  and  FORTRAN,  [Appe88] 

CMS1  for  COBOL,  [Hank80] 

EXPER  for  FORTRAN,  [Budd80a],  [Budd80c] 
portable  mutation  testing  suite,  [Budd83a] 
users  guide  to  the  pilot  mutation  system,  [Budd77] 

using  vector  processors,  [Gali87b],  [Gree87],  [Krau86],  [Ligo87],  [Math86] 
mutant  unification,  [Math88a],  [Math88b],  [RegoXX] 
transformation  techniques,  [Math87a],  [Math87b] 
unified  scheduling  of  mutants,  [Krau88] 
vectorization  over  multiple  data  sets,  [Gali87a]
competent  programmer  hypothesis, 
formal  analysis  of,  [Gour81],  [Gour83] 
theoretical  and  empirical  studies,  [Budd80a] 
constraint  based  test  data  generation,  [DeMi87c] 
coupling  effect  hypothesis,  [DeMi78] 
theoretical  and  empirical  studies,  [Budd80a],  [Budd80b] 
different  forms  of, 
firm  mutation  analysis,  [Wood88] 
see  Weak  Mutation  Analysis,  [Girg85] 

see  also  Specification  Mutation,
syntax  directed/semantics  aided,  [Wu87a],  [Wu87b],  [Wu88] 
effectiveness  of,  [Awr.od] 
integrated  with  perturbation  testing,
automated  support  for  see  also  EQUATE 
problems  and  solutions,  [Budd81],  [Ridd80]
determining  dead  or  alive,  [Wood88] 
determining  equivalence,  [Budd80b] 
heuristics  for,  [Bald79] 

generation  of  mutation-adequate  test  data,  [DeMi88a] 
stability  of  test  data,  [Bum78] 
state  of  the  art,  [Lipt78] 

N-Version  Software,  [Chen78b]
advantages  and  limitations,  [Aviz84],  [Bish86] 
applications  of, 

for  tolerance  of  design  faults,  [Aviz85] 
software  testing,  [Bril87],  [Shim88] 
automated  support  for, 

DEDIX  distributed  supervisor/testbed,  [Aviz85]
coincident  errors,  [Eckh85] 

computing  reference/observed  distribution,  [Vouk85c] 
evaluation  of  assumption  of  independence,  [Knig86a] 
influence  of  programmer  profiles,  [Vouk85b] 
testing  for  version  independence,  [StJe85] 
experiments  in,  [Aviz77],  [Gmei79],  [Knig84] 
analysis  of  faults  in,  [Bril84] 
failure  probabilities,  [Knig86b] 
specification  of,  [Kell82],  [Kell83] 
reliability  evaluation,  [Dunh86]
Markov  model  for,  [Sone80]

data  domain  model  for,  [Scot83a],  [Scot83b],  [Scot87] 
validation  of,  [Scot84a],  [Scot84b] 
testing  and  analysis, 
back-to-back  testing,  [Bish86] 
based  on  use  of  VDM  and  Prolog,  [Bloo86]
extremal-special  value  testing,  [Vouk86b] 
extremal-special  values  testing,  [Vouk86a] 
random  testing,  [Vouk86a],  [Vouk86b] 
effectiveness  of,  [Vouk85a],  [Vouk85c] 
structural  testing,  [Vouk86a],  [Vouk86b] 
theoretical  basis  for  study  of  redundant  software, 
choice  of  “n”,  [Eckh85] 
effectiveness  of,  [Eckh85] 

Operating  Systems, 

FTOS:  Fault  Tolerant  Operating  Systems,  [Sone81] 
performance  evaluation  of,  [Rums77] 
security  verification,  see  also  FDM,  Gypsy,  AFFIRM 
comparison  of  techniques,  [Cheh81] 
specification  for,  [Kore84] 
requirements  for,  [Step74] 
structure  of, 

evaluation  of  based  on  information  flow,  [Henr79] 
modularity  considerations,  [Schn77c] 
monitors  as  a  structuring  method,  [Hoar74] 
test  control  process  for  functional  testing  of,  [Elme69] 
Operational  Usage  Profiles, 
measures  of  testing  representativeness, 
estimator  for  operational  usage  reliability,  [Brow75] 
representativeness  of  functional  test  data,  [Basi84a] 
sampling  theory, 

problems  in,  [Haml86],  [Haml87] 
specification  of,  [Brow75] 
test  cases  to  cover  entire  input  domain,  [Nels78] 

Oracles, 

based  on  specifications,  [Ande76b] 

T-3  Testing  Tool,  [Lawr87] 
using  algebraic  axioms  see  also  DAISTS, 
pseudo  oracles,  [Davi81] 
the  oracle  assumption,  [Weyu80a] 
reasonableness  of  and  consequences,  [Weyu82] 

PACE:  Product  Assurance  Confidence  Evaluator, 

FLOW,  part  of  the  PACE  system,  [Brow72a] 

PET,  [Stuc73],  [Stuc74],  [Stuc75a],  [Stuc77] 

PIC:  Precise  Interface  Control,  [Wolf85b],  [Wolf85c],  [Wolf86a] 
for  Ada  see  also  AdaPIC  Toolset 

PL1, 

automated  support  for, 

EFFIGY  for  symbolic  execution,  [King75a],  [King76]
data  flow  analysis,  [Rose75]
path  testing,  [Bagg80] 
numerical  profile, 

using  Software  Science,  [Elsh76a],  [Elsh76b],  [Zweb79] 

PSL/PSA,  [Teic74],  [Teic77]

Partial  Evaluation, 
applications  of,  [Beck76] 
interpretive  and  compiling  methods,  [Beck76] 

Partition  Analysis,  [Clar84],  [Rich81a],  [Rich81c],  [Rich85b] 
effectiveness  of,  [Rich82] 
examples  of  application,  [Rich81b] 
specifications  for,  [Rich81d] 

Pascal, 

automated  support  for,  [Kem81] 

GRAPHTRACE  interactive  trace  of  heap,  [Getz83] 

Pascal  validation  suite,  [Wich79] 
data  flow  analysis  see  also  ASSET, 
formal  verification  see  also  Stanford  Pascal  Verifier, 
knowledge-based  debugging, 

PROUST,  [John84] 
symbolic  execution, 

UNISEX:  a  Unix-based  executor,  [Eckm83b],  [Kemm85b],  [Soli83] 
with  path  expressions,  [Camp79] 

Path  Analysis, 

constrained  path  problems,  [Ntaf79]
effectiveness  of, 
for  testing  predicates,  [Zeil81b] 
finding  minimum  path  cover,  [Ntaf79],  [Ntaf81b] 
solving  nonlinear  inequalities,  [ElspXX] 
unexecutable  paths, 

allegations  to  avoid  unfeasibility  problems,  [Wood80b] 
detection  supported  by  data  flow  analysis, 
heuristics  for  detecting  some  classes  of,  [Oste77] 

Path  Expressions, 

applications  of, 

specification  of  process  synchronization,  [Camp74],  [Camp79] 
for  timing  analysis,  [Hsie89] 
path  rules  variant  for  debugging,  [Brue83] 

Path  Testing,  [Beiz83] 
automated  support  for, 

PL1  programs,  [Bagg80] 
test  drivers,  [Shoo79] 

comparison  with  other  techniques,  [Dura81a] 
generation  of  test  data,  [Howd75a] 
path  prefix  testing  strategy,  [Prat87] 
reliability  of,  [Howd76c],  [Pimo75] 

Performance  Evaluation, 
a  metrics  success  story,  [Lync81] 
applications  of, 
as  a  design  tool,  [GilkXX] 
modeling  for  program  optimization,  [Shol75]
automated  support  for,
performance  modeling,  [Ches77] 
basic  quantities, 

computing  based  on  queuing  network  models,  [Denn78] 
definitions  of,  [Denn78] 
operational  relationships  between,  [Denn78] 
determining  upper  bound  on  running  time,  [Meye67] 
for  operating  systems,  [Rums77] 
issues  faced  and  alternative  techniques,  [Warn72] 
of  Ada  programs,  [Lee89b],  [Stan83] 
of  compilers, 

discrimination  rate,  [Shaw89] 

relation  between  compile  time  and  modularity,  [Shaw89] 
test  generator,  [Bazz82] 
of  concurrent  systems, 
based  on  directed  acyclic  graphs,  [Sahn87] 
of  real-time  system  programs,  [Ginz65] 
response  times  of  level  structured  systems,  [Hart84] 

state  transition  balance,  one-step  behavior  and  homogeneity  concepts,  [Denn78] 
supported  by, 

abstract  data  types  with  performance  data,  [Boot80] 
closed-form  expressions,  [Wegb75] 

Metric  for  LISP,  [Wegb75] 
formal  methods,  [Levi78] 
model  simulation,  [Lee89b] 
operational  analysis, 

quantifying  errors  in  assumptions,  [Beng87] 
petri  net  and  queuing  network  models,  [Chan89] 

state  model  of  computation,  probabilistic  grammar-based  input,  [GilkXX] 
timed  Petri  nets,  [Razo85] 

Perturbation  Testing,  [Zeil81a],  [Zeil81b],  [Zeil83a] 
comparison  with  other  techniques,  [Zeil84] 
for  computation  errors,  [Zeil84] 
for  domain  errors,  [Zeil83b],  [Zeil89] 
integrated  with  mutation  analysis, 
automated  support  for  see  also  EQUATE, 

Petri  Nets,  [Pete77],  [Pete81] 
applications  of, 

analysis  of  concurrent/distributed  systems,  [Cagl82],  Static  Concurrency  Analysis 
performance  analysis,  [Razo85] 
analysis  of  real-time  systems, 
performance  analysis, 
automated  support  for,  [Chan89] 
safety,  recoverability,  fault-tolerance,  [Leve87] 
language  description  (for  Ada  tasking),  [Mand85] 
static  concurrency  analysis, 
for  Ada,  see  Ada,  [Mura89],  [Shat88] 

Process  Programming,  [Oste87] 
applications  of,  [Romb88c] 

generating  information  bases,  [Romb88a],  [Romb89b]
automated  support  for  see  also  Arcadia 


based  on  software  development  graphs,  [Bjor87] 
specification  language  for,  [Romb88a],  [Romb88c],  [Romb88d] 

Productivity, 

estimation,  [Wals77a] 

using  complexity  metrics,  [Curt79a],  [Curt79b],  [Curt81] 
influencing  factors,  [Chry78],  [Lawr81],  [Vosb84] 
classification  of,  [Vosb84] 
complexity,  [Chen78a] 
human,  [Grif72],  [Love77a] 
saturation  in  team-oriented  development,  [Theb83] 
language  design,  [Bish86],  [Hare82] 

programming/organizational,  [Card87b],  [Duns80],  [Jeff85],  [Sack68] 
issues  of  the  80’s,  [Jone81] 
limits  to,  [Jone79] 

Program  Slicing, 

methods  for,  properties  and  applications  of,  [Weis84] 

Program  Structure, 

classification  of  structure  types,  [Tum80] 
consideration  for, 

ease  of  error  detection,  using  simulation,  [Schn77b] 
error  detection  and  recovery,  [Hom74] 
reliability  prediction,  [Shoo76] 
impact  on, 

complexity  measures,  [Evan83a],  [Evan84c] 
error  detectability,  [Gree76] 
understanding,  [Love77b],  [Wood81b] 
measures  for,  [Gide74] 
stability  measure,  [Soon77],  [Yau79] 
properties  of  “good”  structure,  [Chan73] 

Program  Traces, 
applications  of, 
debugging  Ada  programs, 
trace  database  model,  [LeDo85] 
proving  properties  of  programs,  [Howd78c] 
specification,  [MacL82] 
automated  support  for, 

selective  trace  using  frequency  counts,  [Satt72],  [Satt75] 
symbolic  traces,  [Howd78c] 
value  traces,  [Howd78c] 

Quality, 

correlation  with  testing  effort,  [Tuck65] 
data  sheets,  [Besh85] 
factors  in,  [Press83],  [Walt79] 

software  quality  framework,  [Boeh78],  [Bowe85],  [Cava78]
impact  of, 

structured  programming,  [Bake72a] 
team  design,  [Reit79] 
chief  programmer  teams,  [Bake72a]
user’s  view,  [Davi85] 

Quality  Assurance,  [Dunn82]
automated  support  for,  [Brow73a]
experience  with,  [Ande88] 
management  and  development  tools,  [Cher80b] 
role  of  programming  environments,  [Cher80a],  [Cher80b] 
based  on  inspections,  [Bark89] 
economics  of, 

impact  of  approaches  on  quality  and  cost,  [Albe76] 
examples  of,  [Krac78] 

planning  for  Ada  development  with  2167,  [Bark89] 
guidelines  and  standards,  [Press83] 

Computer  Society  standard  for  SQA  plans,  [Buck79] 
Defense  System  Software  Quality  Program,  [DODS86] 
DoD  guidelines  and  standards,  [Army84] 

DoD  quality  indicators,  [AFSC86b] 

IEEE  software  quality  metrics  methodology,  [IEEE88] 
IEEE  standard  for  quality  assurance  plans,  [IEEE84] 
RADC  measurement  manual,  [McCa80a],  [McCa80b] 
distributed  systems,  [Bowe83] 
engineering  handbook,  [McWe84] 
example  from  AT&T  Bell  Laboratories,  [Ingl86] 
example  of  telecommunication  requirements,  [Eric85] 
for  embedded  real-time  designs,  [Szul80] 
for  flight  dynamics  software,  [Perr87] 
handbook  for  specification/measurement,  [McCa77a] 
industry  and  government  requirements,  [Land86] 
measures  to  produce  reliable  software,  [IEEE87] 
metrics  standard  (concept  of),  [Sing86] 
operational  testing, 
maintainability,  [AFOT87] 
supportability,  [AFOT88a],  [AFOT88b]
usability,  [AFOT82] 

quality  specification  and  evaluation,  [Bowe85] 
survey  of  military  standards  and  metrics,  [Bowe79] 
human  incentives,  [Mizu83] 

in  a  quality  management  program,  [McCa79],  [Walt79] 
practices,  [Brya80],  [Ligh76] 
statistical  quality  control,  [Gran72] 

Quality  Measures,  [Gaff81b]  see  also  Design  Evaluation 
Procedural  Approach  to  the  Evaluation  of  Software, 
Design  and  Management  Indicators,  [Ross88] 
applications  of, 
cost  estimation,  [Ducl82] 
based  on  pattern  recognition  methods,  [McGi77] 
for  distributed  systems,  [Bowe83] 
metrics  for,  [Evan87] 

anomaly  detection,  prediction,  acceptance,  [McCa80a] 
based  on  interconnectivity,  [Kafu81] 
evaluation  and  prediction,  [McCa77b] 
products  and  process,  STARS  metrics,  [Szul84] 
testability  and  testedness,  [Moha76a],  [Moha76b] 
user  satisfaction,
automated  support  for,  [Bail83]
review  of,  [Ives83] 

models  and  metrics  for  management/engineering,  [Basi80a] 
prediction  formulae, 
automated  support  for,  [Amst76] 
utility  of,  [McCa78] 
validation  of,  [Kafu85a] 

Queueing  Analysis, 
applications  of, 

performance  analysis,  [Denn78] 
automated  support  for,  [Chan89] 
for  fault-tolerant  systems,  [Nico87] 

Random  Testing, 

comparison  with  other  techniques,  [Dura81a],  [Haml88],  [Ntaf81a] 
for  fault-tolerant  systems,  [Vouk86a],  [Vouk86b] 
effectiveness  of,  [Vouk85a],  [Vouk85c],  [Vouk86a] 

Recovery  Blocks,  [Hech76] 
automated  support  for,  [Ande76a] 
consensus  recovery  block  method, 
reliability  evaluation, 

data  domain  model  for,  [Scot83a],  [Scot83b],  [Scot87] 
validation  of,  [Scot84a],  [Scot84b] 
for  concurrent  systems, 

sufficient  conditions  for  limiting  rollback,  [Kant80] 
performance  of,  [Wels83] 
data  domain  model  for, 
validation  of,  [Scot84b] 
reliability  evaluation, 

data  domain  model  for,  [Scot83a],  [Scot83b],  [Scot87] 
validation  of,  [Scot84a] 
reliability  model  for,  [Hech76],  [Hech79] 
techniques  for  constructing  acceptance  tests,  [Hech79] 

Regression  Analysis,  [Lee88],  [Leun88] 

0-1  integer  programming,  [Fisc77] 

alternative  retest  philosophies,  [Fisc77] 

automated  support  requirements,  [Panz78a] 

based  on  Cyclomatic  Complexity,  [McCa82a] 

data  structure  for  storing  information,  [Leun88] 

measure  of  tests  affected  by  instruction  change,  [Leun88] 

test  selection,  [Cox81] 

Reliability,  [Bend86],  [Jeli72] 
concepts  and  concerns,  [Rose85b] 

definitions  of,  [Jeli72],  [Musa79b],  [Shoo77a],  [Weis85b],  [Weis88b]
designing/implementing  a  reliability  program,  [Rose85b] 
influencing  factors,  [Jeli72] 

Ada  and  FORTRAN,  [Goel88] 
development  practices,  [Card87b] 
effects  of  field  service  on  multisite  software,  [Bake88] 
language  design,  [Bish86],  [Gann75],  [Gann76],  [Gann77] 
investigative  approaches,
stepwise  statistical  methodology,  [Troy86]
issues  in  software  engineering,  [DownSSb] 
principles  and  practices,  [Mora78a],  [Myer76],  [Myer78b] 
relationship  to  hardware  reliability,  [Piku76] 
study  of  radar  system  software  reliability,  [Bowe78] 
theory  of, 

MTSR:  Mathematical  Theory  of  Software  Reliability,  [RADC76a] 
critique  of,  [Haml78b] 
user’s  view,  [Davi85] 

Reliability  Measurement,  [Musa80a],  [Musa87],  [Thay78]

Software  Reliability  Measurement  Framework,  [McCa87a],  [McCa87b] 
reliability  and  estimation  studies,  [Goel82] 
applications  of,  [Musa80b],  [Musa87],  [Shoo77c] 
acceptance  testing,  [Thom80] 

determine  continuation/termination  of  testing,  [Thom80] 
supporting  system  engineering,  [Musa79b] 
warranty  provision,  [Thom80] 
comparison  of, 

methods  for  obtaining  confidence  intervals,  [Myhr68] 
during  development,  [Shoo73] 
error  rate  forecasting  methods,  [Nage82],  [Nage84] 
examples  of, 

from  a  space  shuttle  software  project,  [Misr83] 
plan  for  the  Air  Force  ASTROS  project,  [John75] 
execution-time  theory  of,  [Musa79a] 
guidelines  and  standards, 

RADC  guidebook,  [Goel83],  [McCa87b] 
example  of  telecommunication  requirements,  [Eric85] 
management  guidebook,  [Glas79] 
measurement,  estimation  and  prediction,  [Hech77b] 
metrics  for  military  systems,  [Koss88] 

operational  reliability,  [Brow75],  [Litt78],  [Mora75],  [Nels73],  [Nels78],  [Weis85b],  [Weis86],  [Weis88b] 
certifying  from  statistical  testing,  [Curr86],  [Mill87a] 
methods  for  determining  confidence  bounds,  [Dura80] 
supported  by  statistical  sampling,  [Dura81b],  [East72] 
supported  by  statistical  testing,  [Dyer85a] 
others,  [Hech77c],  [Krus78],  [Mora72] 
principles  and  practices,  [Hoer74] 
quantitative  measurement  of,  [Brow76] 
review  of  prediction  methods,  [Misr83] 
state  of  the  art,  [Litt80a],  [Litt80b],  [MiyaXX] 
research  directions,  [Litt78] 
system  reliability,  [Ande79b],  [Han76],  [Land77] 

Bayesian  software-hardware  estimation,  [Thom80] 
comparison  of  hardware/software  reliability,  [Musa80b] 
tutorials,  [Hech80] 

Reliability  Models,  [Dale86],  [Musa80b] 

Poisson  model  for  Markov  and  semi-Markov  structured  software,  [Litt76] 
accounting  for  service  organization  characteristics,  [Bake88] 
analysis  and  validation  of,  [Scha79],  [Wigg84] 

Poisson  and  binomial  models,  [Angu80]
continuous  probability  distribution  models,  [Goel80c],  [Schi78]
discrete  models,  [Broo80b] 

discrete  probability  distribution  models,  [Goel80c],  [Schi78] 
time  and  data  domain  models,  [Goel80c],  [Schi78] 
applications  of,  [Goel83],  [Goel85] 

Markov  models, 
validation,  [Triv80] 

probability-based  model  applied  to  code  reading  experiment,  [Jeli73] 
reliability  growth  models  to  support  management,  [Krug88] 
automated  support  for, 

SMERFS:  Statistical  Modeling  and  Estimation  of  Reliability  Functions  for  Software,  [Farr88] 
classification  of, 

based  on  residual  error  size  and  testing  process,  [Rama82] 
comparison  of,  [Farr83],  [Musa87],  [Suke77a],  [Suke77b],  [Suke79] 
criteria  for,  [Iann84] 

data  requirements,  [Duva80],  [Farr83],  [Litt80b],  [Thay75] 
de-eutrophication  process  model,  [Jeli72] 
deterministic  and  statistical  models,  [Moha79] 
elimination  of  perfect  debugging  assumption,  [Ohba89] 
experiments  in, 

6  models  applied  to  a  C3I  project,  [Angu83] 
failure  rate  assumption, 
relaxation  of,  [Giam86] 
for  prediction, 
analysis  of  quality  of, 

comparison  of  models,  inference  procedures,  [Keil87] 
micro  model  based  on  program  structure,  [Shoo76],  [Shoo77a] 
number  of  errors  at  start  of  testing, 
based  on  development  characteristics,  [Taka89] 
probabilistic  model,  [Shoo72],  [Shoo77a] 
for  probabilistic  program  correctness,  [Dura78] 

supported  by  error  reducing  performance  of  development  processes,  [Dura78] 
growth  models, 

for  project  management,  [Krug88]
historical  development  of,  [Schi78] 
metrics, 

incorporating  into,  [Henr88b] 
parameter  estimation, 
examples  of, 

application  of  methods  on  a  C3I  project,  [Angu83] 
validation  of  methods,  [Angu80] 
resolving  constraints  from  availability  of  data, 

S-shaped  and  hyperexponential  models,  [Ohba84] 
review  of,  [Farr83],  [Goel83],  [Goel85],  [RADC76a] 
selection  of,  [Abde86],  [Goel83] 
specific  models,  [Litt75] 

Bayesian  differential  debugging  model,  [Litt80c] 

Error  Complexity  Model,  [Naka89] 

Goel-Okumoto  model, 
for  estimation  of  optimal  test  time,  [Goel81] 

Jelinski-Moranda  model,  [Litt81b]
comparison  with  other  models,  [Suke79]

experiments  in,  [Rowl88] 

variations  for  early  error  estimation,  [Mora80] 

Poisson  model  for  Markov  and  semi-Markov  structured  software,  [Litt79] 
S-shaped  reliability  growth  model,  [Yama83] 

Schick-Wolverton  model,
comparison  with  other  models,  [Suke79]
modified  Schick-Wolverton  model,  [Suke79]
data  domain  models, 

for  fault-tolerant  systems,  [Scot83a],  [Scot83b],  [Scot84a],  [Scot84b],  [Scot87]
Markov  model,  [Sone80] 
for  fault-tolerant  systems,  [Hech79] 
non-homogeneous  Poisson  process,  [Schn75] 
probabilistic  model  and  stopping  rule  for  debugging,  [Form77] 
stochastic  Markov  process  for  hardware/software,  [Bene85]
automated  support  for,  [Bene85] 
stochastic  growth  model,  [Litt81a] 

stochastic  model  based  on  a  non-homogeneous  Poisson  process,  [Goel79] 
with  applications,  [Goel80a],  [Goel80b] 
time-based  models, 

Musa’s  execution  time-based  model,  [Musa84] 
application  in  a  computation  center  software,  [Haml78c] 
evaluation  of,  [Mill80c] 

state/time-dependent  failure  rate,  imperfect  debugging, 
binomial  model  for  error  occurrences,  [Shan81] 
maximum  likelihood  estimates  for  parameters,  [Shan81] 
relationship  with  other  models,  [Shan81] 
survey  of,  [Shoo77a] 
time-based  models, 

Bayesian  growth  model,  [Litt73] 

Musa’s  execution  time-based  model,  [Musa75],  [Musa79a] 

Poisson-process  models, 
impact  of  test  process,  [Ehrl87] 

completely  monotonic  regression  estimates,  [Mill85],  [Mill86]

Reproducible  Testing,  [Weis88a] 
approaches,  [Tai85c],  [Tai86] 
for  host-target  environment,  [Tayl82b] 
for  testing  monitors,  [Brin78] 
automated  support  for  dataflow  machines,  [Wahl88]
for  Ada,  [Tai85b] 

Required  Element  Testing, 
comparison  with  other  techniques,  [Ntaf81a] 
strategies  for,  [Ntaf82] 
evaluation  of,  [Ntaf84] 

Requirements  Analysis, 

SAMM  modeling  tool,  [Lamb78] 

testability  modeling  using  finite  state  machines,  [Chan85] 

Resource  Estimation,  see  Cost  Estimation,  [Bail80] 

Reusable  Libraries, 
software, 

Ada  Software  Repository,
metric  analysis  of,  [Leac89]

Reusable  Libraries, 

hardware, 

for  hardware  verification,  [Bevi88] 
software, 

Ada  Software  Repository,  [Conn87] 

RLF:  Reusability  Library  Framework  project,  [Sold89] 
repository  management,  [Sold89] 

Reuse, 

analysis  of,  [Selb87d],  [Selb88c] 

domain  modeling,  [Sold89] 

for  Ada,  see  Ada,  [Romb88h] 

investigation  of  reuse  and  complexity,  [Basi82d] 

measurement  of  reusability,  [Hess88] 

research  framework  for,  [Basi87d] 

software, 

evaluation  of  life  cycle  models  for,  [Guin89] 

Reuse  Libraries, 

software, 

Moorehouse  object-oriented  reuse  library,  [Jone89] 

Revealing  Subdomains,  [Weyu80c] 

Review  of,  [NBS82a] 

automated  support  tools,  [Perr88],  [Rama75a],  [Reif75] 
cost  estimating, 

conversion  cost  estimating  techniques,  [Hout81] 
models,  [Ducl82] 

formal  functional  specifications  for  modules,  [Lisk79] 
formal  verification,  [Dunn84] 
graph  partitioning  methods,  [Paig77b] 
human  factors, 

psychological  research  on  programming,  [Shei81] 
measurement, 

studies  at  General  Motors,  [Elsh78a] 
metrics, 

Software  Science,  supporting  evidence,  [Fitz78a] 
complexity  metrics,  [Ducl82] 
for  user  information  satisfaction,  [Ives83] 
process/product  measures  for  error-prone  software,  [Shen85] 
quality  metrics,  [Ducl82] 
testing  metrics,  [Perr88] 
reliability, 

models,  [Farr83],  [Goel83],  [Goel85],  [RADC76a] 
prediction  methods,  [Misr83] 
testing  and  analysis  techniques,  [Clar78a],  [Dunn84] 
for  real-time  software,  [Quir85] 
testing  environments,  [Rama75a] 
testing  strategies,  [Dunn84] 

Risk  Analysis,  see  also  FMEA 
DoD  guidelines  and  standards, 
risk  abatement,  [AFSC88b] 
cost/benefit  analysis,
using  Bayesian  decision  model,  [Wein80]

Risk  Reduction  Approaches,
dual  programming,  [Long77] 
multi-specification,  [Long77] 

Run-Time  Monitoring,  see  also  Instrumentation 
automated  support  for, 

Algol68  numerical  algorithms  testbed,  [Henn76a] 
for  Ada  see  also  Ada, 

for  concurrent/distributed  systems,  EDL  [Yau80] 

Observer,  [Ayac79] 

problems,  practices,  roles,  tools,  [Joyc87b] 
inquiry  language  and  processor,  [Cohe77] 

SADMT,  [Linn88] 
automated  support  for, 

SADMT/SF:  SADMT  Simulation  Facility,  [Linn8S] 
SAGEN  user’s  guide,  [Kapp88] 
example  of  an  architecture  specification,  [Ardo88]
interface  to  SADMT/SF,  [Linn88] 

SEL,  [Basi77a],  [Basi78a],  [Card82] 

Composite  Specification  Model  (CSM),  [Agre87] 
compendium  of  tools,  [Deck82b] 
cost  estimation,  [McGa84] 
data  collection  and  analysis, 
analyzing  error  data,  [Basi81e] 
automated  support  for,  [Gree81] 
database, 

organization  and  user’s  guide,  [Lo83],  [NASA81] 
procedures  for  the  rehosted  SEL  database,  [Hell87] 
guide  to  data  collection,  [Chur82] 
data  compendium,  [Tum81b] 
glossary  of  software  engineering  terms,  [Babs83] 
operation  of,  [Basi78c] 
recent  studies,  [McGa85a] 
relationship  equations,  [Freb79] 
specification  measures  for,  [Agre84a],  [Agre84c] 

SEL  Comparisons, 

RADC  and  SEL  software  development  data,  [Tum81a] 
resource  utilization  curves,  [Basi81f] 

SELECT  a  symbolic  execution  system,  [Boye75] 

SEL  Evaluations  and  Experiments,  [Card85b] 

Ada,  [Agre86],  [Godf87] 

IV&V  methodology  for  flight  dynamics,  [Page85] 
complexity  measures,  [Basi83b] 
fault  prediction  and  reliability  assessment,  [Basi86e] 
resource  forecasting,  [Basi78a] 

resource  quality  impact  on  product  and  process,  [McGa85b] 
software  development  practices,  [Agre84b],  [Chen81] 
designing  a  measurement  experiment,  [Basi77b] 
impact  of  design  practices,  [Card86a],  [Card86b] 
impact  on  productivity  and  reliability,  [Card87b]
lessons  learned,  [Basi85d]

statistical  model  for  evaluating  effectiveness,  [Card87b] 
statistics  on  errors,  [Basi82a],  [Weis85c] 
software  metrics,  [Basi81g] 
structural  coverage  in  SEL  environment,  [Basi84a] 
study  of  Musa’s  reliability  model,  [Mill80c] 
summary  of  software  measurement  experiences,  [Vale89] 

SEL  Software  Development  Characteristics, 
development  measures,  [Card84],  [Hell87] 
dynamic  variables,  [Basi83d],  [Doer85]
evaluation  of  management  measure,  [Page82] 
evaluation  of,  [Basi79c],  [Basi81d],  [Card81] 
relationship  among  development  variables,  [Basi81a],  [Basi85e] 
environment  characteristics,  [Basi79a] 
calculation  and  use  of,  [Basi85a] 
use  and  interpretation,  [Romb85b] 

STAD:  System  for  Testing  and  Debugging,  [Kore85],  [Kore86a],  [Kore88]
YODA:  Your  Own  Ada  Debugger,  [Lask88b] 
trace  database  model,  [LeDo85] 
dependence-based  modeling,  [Kore86b],  [Kore87] 

Safety  Analysis,  see  also  Fault-Tree  Analysis  [Leve83d] 
based  on  constrained  expressions,  see  Constrained  Expressions,  [Leve87] 
evaluation  standards  for  safety  critical  software,  [Pam88] 
issues  and  research  directions,  [Leve86b] 
of  timing  properties  in  real-time  systems, 
based  on  RTL:  Real-Time  Logic,  [Jaha86] 
quantitative  measurement  of  safety,  [Brow76] 

Security  Analysis, 
basic  security  concepts,  [Hall80] 

Security  Verification,  see  also  FDM,  Gypsy,  AFFIRM,  A1  Certification,  HDM 
comparison  of  specification  paradigms,  [Kauf87b] 
comparison  of  techniques,  [Mill81b] 
requirements  for  secure  operating  systems,  [Step74] 
specification/verification  of  o/s  security,  [Feie80],  [Kore84] 

Self-Checking,
experiments  in,  [Cha87] 

Simulation, 
applications  of, 

design  and  validation  aid,  [Jack71] 

evaluate  designs/ease  of  error  detection,  [Schn77b] 

performance  evaluation,  [Lee89b] 

testbed  for  cooperative  distributed  problem  solving,  [Less80] 
automated  support  for,  SADMT,  DARTS 
TASKIT:  Tasking  Ada  Simulation  Kit,  [Ange89]

Sneak  Analysis,  [Godo77] 

Software  Development  Management,  [Daly77] 
control  over  software  engineering  process,  [Dyer80] 
designing/implementing  a  reliability  program,  [Rose85b] 
guidelines  and  standards, 
configuration  management,  [IEEE83c] 
management  indicators,  [AFSC86a]
maintenance,  [Adam84]

relationships  of  strategies  to  repair  maintenance,  [Grad87a] 
state  of  the  art,  [Thay80] 

supported  by,  SEL  Software  Development  Characteristics 
cost  estimation,  [Putn77],  [Putn78],  [Putn79] 
macro  variable  models,  [Gaff80]
management  indicators,  [Ross88] 
models  and  metrics  for,  [Basi80a],  [Gaff81a] 
process-related  productivity  factors,  [Vosb84] 
productivity,  performance,  progress  measurement,  [Howe84] 
reliability  growth  models,  [Krug88] 

Software  Development  Practices,  see  also  Chief  Programmer  Teams,  Structured  Programming 

a  rigorous  approach,  [Jone80] 
design, 

evaluation  of  technology  and  practices,  [Brun86] 
overview  of  formal  methods,  [Hoar87] 
parallel  program  design,  [Chan88] 
structured  design,  [Stev74],  [Your76] 
with  constant  evaluation  by,  [Ches77] 
evaluation  of,  see  also  SEL  Evaluations  and  Experiments 
impact  on  understandability  and  modifiability,  [Shep78] 
lessons  learned,  [Basi85d] 
through  application  to  real  projects,  [Snee84] 
problems  and  proposed  solutions,  [Zelk78] 
programming, 
by  action  clusters,  [Naur69] 
programming  style,  [Kern74a],  [Kern74b]

Software  Development  Process, 
guidelines  and  standards, 

Defense  Systems  Software  Development,  [DOD88] 
life  cycle  approaches, 
evaluation  wrt  reuse,  [Guin89] 
evaluation  wrt  validation  and  verification,  [Guin89] 
iterative  enhancement,  [Basi75] 
cost  model  for,  [Ducl82] 
paradigmatic  approach,  [Walk81] 
risk-driven  approach,  Spiral  Model,  [Boeh86] 
transformational  approach,  [Baue79b],  [Baue89] 

PROSPECTRA  project,  [Krie86] 
model  of  construction/reasoning  errors,  [Howd89a] 
prototyping, 

evaluation  of  software  prototypes,  [Chur86] 
operational  specification  as  a  basis  for,  [Balz82] 
prototyping  versus  specifying,  [Boeh84a] 
uses  of  and  techniques,  [Tayl82a] 
tailoring  and  improving  process  see  also  TAME, 

Software  Physics,  [Hals75a],  [Knij78] 
analysis  of  Akiyama’s  debugging  data,  [Funa75] 
evaluation  of,  [Love76] 
experiments  in,  [Gord76] 

Software  Science,  [Chri81],  [Fitz78a]  see  also  Software  Physics,  [Hals77a],  [Hals78],  [Harr88b],  [Yeh79] 
APL  and  Halstead’s  theory  of  metrics,  [Deke81]

Halstead’s  criteria  and  statistical  algorithms,  [Bohr75] 
adaptations  of,  [Bake79a] 
applications  of,  [Smit79] 
compiler  performance  evaluation,  [Shaw89] 

error  prediction  prior  to  testing,  [Corn76],  [Otte78],  [Otte79],  [Otte81]
evaluating  modularity  concepts,  [Bake79b] 

productivity  prediction,  [Come79],  [Curt79a],  [Curt79b],  [Curt81],  [Grem84],  [Hals77d],  [Moha79] 
project  management,  [Hals77b],  [Suno82] 
automated  support  for,  [Otte76] 

comparison  with  other  measures,  [Albr83],  [Bake80],  [Blai85a],  [Gaff79],  [Kitc81],  [Wood81a] 
correlation  with  other  measures,  [Lind89] 
counting  strategies,  [Fits79] 
description  and  example  of,  [Salt82] 

evaluation  of,  [Bake79a],  [Basi81g],  [Fitz78b],  [Hame82],  [Lass81],  [List82],  [Mora78c],  [Shen83] 
relationship  between  estimated/actual  size,  [Card87a] 
relationship  with  development  effort,  [Kitc81],  [Lind89] 
review  of  supporting  evidence,  [Fitz78a],  [Shen83] 
validation  across  FORTRAN  programs,  [Basi83b] 
with  respect  to  cognitive  psychology,  [Coul83] 
example  analysis, 
from  technical  writing,  [Hals77c] 
of  COBOL  programs,  [Shen80] 
of  IBM  programming  products,  [Smit80a] 
of  PL1  programs,  [Elsh76a],  [Zweb79] 
of  programming  size,  [Smit80b] 
real-time  switching  system,  [Bail81] 
experiments  in,  [Come79] 
for  designs,  [Szul81],  [Szul84] 

foundations,  [Hals72a],  [Hals73a],  [Hals73b],  [Hals76] 
influencing  factors, 
basic  constructs,  [Lass79] 
effect  of  the  counting  method,  [Elsh78b] 
vocabulary  effects,  [Fits80]
language  level  metric,  [Cont81],  [Olde77]
length  equation,  [John81] 
theory,  [Hals75b] 

Special  Values  Testing, 
comparison  with  other  techniques,  [Howd77c] 

Specification-Based  Testing,  see  also  Constraint  Logic  Programming 
comparison  with  other  techniques,  [Hetz76] 
for  test  data  generation,  [Gour81],  [Gour83],  [Lask88a] 

T-3  Testing  Tool,  [Lawr87] 
state  of  the  art,  [Gour81] 
using  Prolog,  [Boug85a] 

Specification, 
applications  of,  [Parn79]
transformational  programming,  [Baue89] 
evaluation  of  techniques, 
criteria  for,  [Lisk75] 
formal  methods  of,  [Berg82] 
review  of  functional  specifications  techniques,  [Lisk79]
incremental  construction  by  combining  of  parallel  elaborations,  [Feat89] 
research  directions,  [Lisk75] 
role  of,  [Lisk75],  [Lisk79] 
testing,  verification  and  analysis, 
the  LEONARDO  project,  [Gerh88a] 

Specification  Languages,  see  also  Finite  State  Machines,  Abstract  Data  Types,  PSL/PSA 
ASLAN,  [Auer85] 

Lotos,  [Brin87],  [ISO87c],  [Najm87]
executing  Lotos  specifications,  [Bria86] 

PSL/PSA:  Problem  Statement  Language/Problem  Statement  Analyzer, 
modeled  using  axiomatic  methods,  [Gerh84] 

SEMANOL(73),  [Ande76b] 

SEQuEFY  sequence  model  system,  [Gerh88a] 

X  a  computer-based  specification  language,  [Bish86] 
abstract  specifications, 
roles  and  examples  of,  [Parn77]
algebraic  axioms, 

combined  with  predicate  transformers,  [Gutt80] 
example  from  a  text  editor,  [McMu83]
used  as  a  test  driver  see  also  DAISTS, 
algebraic  specifications,  [Ehri85] 

OBJ:  a  language  for  writing  and  testing,  [Gogu79a] 
testing  of,  [Gogu79b] 

theory  and  application  to  testing,  [GaudXX] 
applications  of, 

defining  abstract  models  of  a  system,  [Ches77] 
describing  program  behavior, 
using  time  sequences,  [Dahl79a] 
documenting  hierarchical  design  process,  [Ches77] 
for  monitoring  and  debugging  Ada, 
relational  algebra,  [DiMa85] 
for  oracles,  SEMANOL(73),  [Ande76b] 
runnable  specifications  as  a  design  tool,  [Davi82b] 
testing  communication  protocols,  [Dsso86] 
to  facilitate  proof  of  correctness,  [Noon75] 
automated  support  for,  [Pate89] 
behavioral  abstraction  approach  see  also  EDL, 
desirable  features  of,  [Gogu80] 
distributed  systems, 

EBS:  Event-Based  Specification  Language,  [Chen83] 
for  Ada  see  also  Ada, 
for  data  types, 

final  data  type  specifications,  [Kami80] 
for  hardware,  VHSIC 

HDL:  Hardware  Design  Language,  [Luck86a] 
for  real-time  systems, 

RT-ASLAN,  [Auer86]
based  on  Lucid,  [Skil89] 
temporal  assertions,  [Lamp83] 
Larch  family,  [Gutt85]




August  9, 1989 


predicate  calculus, 

for  testing  programs  by  specification  mutation,  [Budd85] 
semi-formal  approaches,
design  conversations,  [Conk88]
role-activity  models,  [Conk86] 
scenarios,  [Wexe87] 

using  traces  to  write  abstract  specifications,  [Bart77] 
a  formal  foundation,  [MacL82] 

Specification  Mutation, 
applications  of, 
for  program  testing,  [Budd85] 

Specification  Testing  and  Analysis,  [Gerr85],  [Prob82b] 
automated  support  for, 

Inatest,  [Eckm84],  [Eckm85],  [Kemm85a] 
concurrent  systems, 

to  detect  synchronization  errors,  [Tai85a] 

dual  specification  comparison  based  on  symbolic  execution,  [Rama81] 

Standards  Checking, 
automated  support  for,  [Henn84] 

Program  Testing  Translator,  [Stuc72] 
for  Ada,  based  on  DIANA  intermediate  form,  [Byrn89] 

Stanford  Pascal  Verifier,  [Luck79a],  [Luck79b] 
survey  of  applications,  [Luck77] 

State  Transition  Models, 
applications  of, 

axiomatic  approaches,  [Gutt77] 

specification/verification  of  communication  protocols,  [Suns77],  [Suns82] 
automated  support  for  see  also  AFFIRM, 

Statement  Testing, 

automated  support  for  see  also  DAISTS, 
comparison  with  other  techniques,  [Selb86] 
procedure  coverage  as  an  alternative,  [Basi84a] 

State  of  the  Art, 

DoD  practices,  [STE86] 

automated  support,  [DeMi87a],  [Reif75] 

Ada  compilation  systems,  [Bend89] 
development  environments,  [Tayl87] 
concepts/research  issues  in  technology,  [Wegn79] 
contributions  of  experiments  to  software  engineering,  [Basi86a] 
data  collection  and  analysis,  [Thib78] 
formal  verification,  [Kemm86],  [Land86],  [Youn89a] 
automated  support  for,  [Crai86],  [Crai87a] 
for  Ada,  [Mayf85],  [Mayf86],  [Roby85] 
measurement,  [Youn89a] 
design  metrics  evolution,  [Romb88f] 
metrics  in  quality  assurance,  [Gaff81b] 
productivity  issues  of  the  80’s,  [Jone81] 

reliability  measurement,  [Bend86],  [Keil87],  [MiyaXX],  [Musa80b] 
software  development  management,  [Thay80] 

testing  and  analysis,  [Budd83b],  [DeMi87a],  [Gerh79],  [Good79a],  [Hans84],  [INFO79],  [Land86],  [Mill79a],
[Youn89a] 
challenges  to  the  testing  community,  [Mill79c]
code  reading  and  inspections,  [Faga86] 
data  flow  analysis,  [Clar86a] 

examination  based  on  testing  process  model,  [Gelp88] 
issues,  [Adri80] 
mutation  analysis,  [Lipt78] 
research  directions,  [Howd87] 
specification-based  program  testing,  [Gour81] 

strengths,  weaknesses,  operational  characteristics,  [Oste80]
techniques  for  real-time  software,  [Quir85] 
technology  needs  in  the  80’s,  [Mill79a] 
verification  in  the  80’s,  [Gerh78] 

State  of  the  Practice, 
automated  support,  [DeMi87a] 
programming  problem  areas,  [Elsh76b] 
testing  and  analysis,  [DeMi87a] 

Static  Analysis, 

types  of  errors  found  and  resource  costs,  [Gann79] 

Static  Concurrency  Analysis,  [Saxe77] 

RGA:  Reachability  Graph  Analyzer,  [Morg84],  [Morg86],  [Morg87] 
algorithm  for,  [Tayl81] 
applications  of, 

reconstructing  execution  host/target,  [Tayl82b] 
structural  testing,  [Tayl86a] 
combined  with, 
dynamic  analysis,  [Tayl83c] 
principles  for  automated  support,  [Tayl83c] 
symbolic  execution,  [Youn86a] 
complexity  of,  [Tayl83b] 
for  Ada,  see  Ada,  [Tayl83a] 

syntax-based  synchronization  analysis  with  feasibility  constraints,  [Carv88]

Statistical  Testing,  [Dyer82a],  [Dyer85a],  [Mill72d]
certification  of  reliability,  [Curr86],  [Mill87a] 
estimation  of  reliability,  [Dyer85a] 
relationship  to  formal  verification,  [Mill87a] 

Structural  Testing, 

automated  support  for,  DAISTS,  FORTEST  [Mill74a],  [Mill74b] 

FLOW,  part  of  the  PACE  system,  [Brow72a] 
requisite  support  for  concurrent  systems,  [Tayl86a] 
combined  with  functional  testing, 
automated  support  for,  [Clar78a] 
comparison  of  coverage  of  metrics,  [Ntaf85],  [Weis85a] 
comparison  with  other  techniques,  [Howd80c],  [Hwan81] 
fault  detection  effectiveness/cost  faults,  [Basi85b] 
coverage  measures, 

as  indicators  of  system  performance,  [Wu87c] 

based  on  LCSAJs,  [Henn76b] 

definitions,  [Mill80a] 

for  Ada,  [Wu87c]

hierarchy  of,  [Wood80b] 

statement  and  expression  see  also  DAISTS, 
evaluation  of,

error  detection  ability,  [Girg86a] 
relationship  of  coverage/representativeness,  [Brow75] 
selectivity  of  path  selection  criteria,  [Zeil88a] 
exercising  program  segments,  [Popk78] 
for  fault-tolerant  systems,  [Vouk86a],  [Vouk86b] 

Structured  Programming,  [Dahl72] 

Dijkstra’s  calculus  for  formal  program  development,  [Grie76] 
and  complexity, 

formalization  and  application  of,  [McCl76]
measuring  and  controlling  complexity,  [McCl78a]
sources  of  complexity,  [McCl78a]
error-free  programming,  [Mill75d] 
experiments  in,  [Basi81c] 
evaluation  of,  [Broo81],  [John75] 
formal  verification  of,  [Ling79] 
impact  on  quality,  [Bake72a] 
predicting  effect  on  resource  consumption,  [Parr80] 
process  as  well  as  program  structure,  [McCl78b]
theory  of,  [Ling79],  [Mill75a] 

Structured  Testing,  [Wals77c] 
applications  of,  [McCa82a] 

using  Cyclomatic  Complexity,  [McCa76],  [McCa82c],  [Perr88] 

Survey  of, 

automated  support  tools,  [FSTC83],  [Mill77b],  [NBS82b],  [Perr83] 
debugging  tools,  [Schw70a] 
error  analysis  work,  [Amor75] 
error  types,  frequencies  and  habitats,  [Schw70a] 
formal  verification, 
automated  support  tools, 

Stanford  Pascal  Verifier,  applications  of,  [Luck77] 
mechanical  support  for  formal  reasoning,  [Lind88d] 
theorem  provers,  [Elsp72a] 
results  of  Hoare’s  logic  approach,  [Apt81] 
techniques,  [Adri82],  [Elsp72a] 
for  parallel  programs,  [Barr85] 
for  procedure  and  data  abstractions,  [Shan82] 
theory,  [Elsp72a] 
measurement, 

military  standards/metrics  for  quality,  [Bowe79] 
reliability, 

models,  [Rama82],  [Shoo77a] 
technological  management  techniques,  [Glas79] 
testing  and  analysis  techniques,  [Adri82],  [Bils83],  [Mill72c],  [NBS82b] 
dynamic  analysis,  [Howd81c] 
methods  for  estimating  test  data  adequacy,  [Rama82] 
static  analysis,  [Howd81b] 

communication  protocols,  recent  developments,  [Sari88a] 

Symbolic  Execution,  see  also  Symbolic  Testing 
applications  of,  [Clar81a],  [Clar85b] 
debugging, 
Symbolic  Debug/1000,  [HCP82]
using  path  rules,  [Brue83] 
fault-based  testing,  [More88] 
symbolic  fault  tracking  see  also  Perturbation  Testing, 
formal  verification,  [Clar84],  [Hant76],  [King76] 
adaptation  of  Manna’s  technique,  [Burs74] 
for  Ada  see  also  Ada, 
of  communication  protocols,  [Bran78] 
prototyping,  [Cohe82] 

testing  and  analysis,  [Chea79],  [Clar76a],  [Clar76b],  [Clar81c],  [Darr78],  [King75a],  [King75b],  [King76], 
[Rich85a] 

combined  with  static  concurrency  analysis,  [Youn86a] 
compiler  testing,  [Same76] 
fault-based  testing,  [More87] 
partition  analysis,  [Rich81c] 
path  generation,  [Clar84] 
goal-oriented  approach,  [Wood80c] 

test  data  generation  methods,  [Chen76],  [Clar76a],  [Clar84],  [Howd77c],  [Rama76] 
automated  support  for,  [Clar81a],  [Clar85b]
design  of,  [Howd77a] 

automated  tools,  see  also  SELECT,  DISSECT 
EFFIGY  for  PL/1  programs,  [King75a],  [King76] 

Inatest,  [Eckm84],  [Eckm85],  [Kemm85a]

UNISEX:  a  Unix-based  executor  for  Pascal,  [Eckm83b],  [Kemm85b],  [Soli83] 
for  Ada,  see  Ada,  [Harr88a] 
for  EL1,  [Chea79]

for  FORTRAN,  [Clar76b],  [Rama76] 

SADAT,  [Voge80] 

an  executor  based  on  MACSYMA,  [Fava79] 
conceptual  representation  for  programs  with  side-effects,  [Hewi76] 
path  selection,  [Wood78] 

strengths,  weaknesses,  operational  characteristics,  [Oste80] 

Symbolic  Testing, 

Lindenmayer  grammars,  [Howd78e] 
automated  support  for  see  also  DISSECT, 
estimation  of  cost  using  available  systems,  [Howd77a] 
comparison  with  other  techniques,  [Howd77a],  [Howd77c] 
for  Ada,  see  Ada,  [Clar86b] 
reliability  of,  [Howd77a],  [Howd77b],  [Howd77c] 

System  Structure, 
cluster  partitioning,  [Hutc83] 
automated  support  for,  [Bela81] 
metric  to  quantify  partition  complexity,  [Bela81] 
quantifying  ratios  of  coupling/cohesion,  [Selb88a],  [Selb88b] 
to  support  error  localization,  [Selb88a],  [Selb88b] 
cost  of  modularization,  [Camp76] 

criteria  for  modularization,  [Card85d],  [Parn72a],  [Parn72b],  [Schn77c] 
based  on  issues  of  fault  tolerance,  [Rand75] 
for  extensible/contractable  software,  [Parn78] 
information  hiding,  [Parn72c] 
monitors  as  a  structuring  method,  [Hoar74] 
hierarchical  ordering  of  functions/variability,  [Dijk76b]
evaluation  of, 

based  on  information  flow,  [Henr79],  [Henr81b],  [Henr84] 
experiments  in, 

global  vs  parameterized  module  connections,  [Lohs84] 
relationship  with  maintainability,  [Gibs89] 
meanings  of  the  term  “hierarchical  structure”,  [Parn74]
modeling  stabilization  of  a  large  system,  [Hane72] 
relating  rate  of  progress  to,  [Parr80] 
response  times  of  level  structured  systems,  [Hart84] 

System  Testing,  [Perr88] 
aided  by  structured  analysis,  [McCa82b] 
estimating  duration  of,  [Krug88] 
impact  on  reliability  growth  models,  [Ehrl87] 
methods,  [Cele81] 

priority  rules  for  test  case  selection,  [Pets85] 

TAME:  Tailoring  A  Measurement  Environment,  [Basi87a],  [Basi87b],  [Romb88e] 
exploiting  feedback  from  evaluation,  [Basi88] 
improvement-oriented  process  model,  [Romb88b] 
integrating  measurement  into  environments,  [Basi87c] 
lessons  learned  in  the  development  process  and  measurement,  [Romb85a] 
tailoring  process  to  goals,  environments,  [Basi87c],  [Basi88] 

TEAM:  Testing,  Evaluation,  and  Analysis  Medley, 

ARIES:  a  multi-lingual  interpreter,  [Epp86],  [Zeil87]
design  principles  of,  [Clar88a] 

evaluation  of  testing  and  analysis  techniques,  [Clar88b] 
integration  of  testing  and  analysis  techniques,  [Clar88a] 
model  for,  [Clar88b] 

TSL:  Task  Sequencing  Language,  [Helm85],  [Luck86a] 

TSL-2  for  distributed  systems,  [Luck87] 
testing  and  debugging  of  Ada  programs, 
runtime  monitor,  [Luck87] 

Technology  Transfer,  [Whit88b] 

Temporal  Logic,  [Krog87],  [Lamp83],  [Pnue77] 
applications  of, 

design/synthesis  of  synchronization  skeletons,  [Clar81b] 
verification  of  finite-state  concurrent  systems,  [Clar86d] 
combined  with  Ina  Jo  see  also  FDM 
comparison  of, 

EBS  with  temporal  logic  and  trace  approaches,  [Chen83] 
complexity  of,  [Sist88] 

proof  systems  based  on  temporal  logic,  [Barr84],  [Nguy86],  [Owic82] 
time,  clocks  and  the  ordering  of  events,  [Lamp78] 

Test  Data, 

aid  to  proving  correctness,  [Gell78],  [Howd78b] 

Test  Data  Adequacy,  [Weyu80b] 
completeness  criteria,  [Wals85] 
based  on  ability  to  distinguish  functions,  [Howd80d] 
based  on  testing  complexity,  [Tai80] 
for  concurrent/distributed  systems,  [Weis87] 
theory  of,

abstract  definition  of,  [Weyu83] 

axiomatic  theory  of  adequacy,  [Weyu84b],  [Weyu89],  [Zweb89] 
determining  correctness,  [Broo80d] 

reliability,  [Good75a],  [Good75b],  [Haml78a],  [Howd76c],  [Ostr78],  [Ostr80]

revealing  test  criteria/subdomains,  [Weyu80c]

testing  for  probable  correctness,  [Haml86],  [Haml87] 

theoretical  analysis,  [Davi83b] 

two  notions  of  correctness,  [Budd80d] 

Test  Data  Selection, 
for  loop  free  programs,  [Cher79]
methodology  for,  [Howd74a],  [Howd76b] 
supported  by, 

analysis  of  memory  dump,  [Ehre76] 
integer  programming,  [Lee88] 
test  case  specifications, 

TESTER/1,  [Pete76] 

Test  Data  Selection  Criteria,  see  also  Mutation  Analysis  [Clar78b],  Structural  Testing,  Data  Flow  Analysis 
for  abstract  code,  DAISTS,  EQUATE,  Symbolic  Fault  Tracking 
syntactic,  semantic,  methodological  problems,  [Zeil88c] 
using  Prolog,  [Boug86] 

Test  Drivers, 

TST:  Ada  Test  Support  Tool,  [Maye89] 
for  Ada,  see  Ada,  [Bess87] 
for  FORTRAN, 

test  procedure  language/processor,  [GE77a],  [GE77b],  [Panz76],  [Panz78a],  [Panz78b],  [Panz78c] 
for  path  testing,  [Shoo79] 
for  pseudo-exhaustive  testing,  [Bagg78] 

Test  Effectiveness,  [Perr83] 
based  on, 

error  reducing  performance  of  development  processes,  [Dura78] 
evaluation  of  test  representativeness,  [Brow75] 
specifications  in  predicate  calculus,  [Budd85] 
estimation  of  residual  faults  and  effectiveness,  [Bowe84] 
formalism  for  completeness  of  error-based  techniques,  [Howd82a] 
measurement  of, 

Algol68  numerical  algorithms  testbed,  [Henn78],  [Henn84] 

Test  Management,  [Evan84b],  [Perr83] 
allocation  and  utilization  of  resources,  [Shen85] 
establishing  a  company-wide  metrics  program,  [Grad87b]
guidelines  and  standards,  [Hetz84] 

Software  Test  and  Evaluation  Manual,  [DODD87] 

Test  and  Evaluation  Master  Plan  guidelines,  [DODD86b] 

Test  and  Evaluation  guidelines,  [Army87],  [DODD86a] 
operational  testing, 
management  guidelines,  [AFOT86] 
software  acquisition  guidance  (maintenance),  [Stan77] 
methodology  for  test  specification  and  auditing,  [Ceri81] 
test  control  process  for  functional  testing,  [Elme69] 
traceability  from  requirements  to  system  test,  [Care77] 

Test  Path  Adequacy,  see  also  Perturbation  Testing 
measure  for  advantage  of  testing  another  path,  [Zeil81b]

Test  Path  Generation, 

algorithms  for,  [Han76] 

automated  support  for  see  also  Symbolic  Execution 
complexity  of  algorithms  for  building  a  path,  [Gabo76] 
notions  of  required  pairs/paths,  [Ntaf79],  [Ntaf81b] 

Test  Planning,  [Bran80],  [Perr83] 
based  on  structural  characteristics,  [Moha79] 
based  on  testing  theory,  [Moha79] 
effort  estimation, 

based  on  Goel-Okumoto  reliability  model,  [Goel81] 
based  on  measure  of  testability,  [Moha76b] 
optimum  allocation  of  effort,  [Down85a],  [Down86] 
predicting  error  content  prior  to  testing,
using  Software  Science,  [Corn76],  [Otte78],  [Otte79],  [Otte81]

probabilistic  cost  model  for  optimal  number  of  test  cases,  [Brow89] 
for  systems  testing,  [Perr88] 
guidelines  and  standards,  [Hetz84] 
optimal  testing,  [Mitt82] 
supported  by, 
network  analysis,  [Krau73] 

test  plan  generation  using  formal  grammars,  [Baue79a] 

Testing  Environments,  see  also  Mothra 
Algol68  numerical  algorithms  testbed,  [Henn78],  [Henn84] 

FORTRAN  Automatic  Code  Evaluation  System,  [Rama73],  [Rama74a] 
ISMS  experimental  program  testing  facility,  [Fair75] 

IUTF:  Interactive  Unit  Test  Facility,  [Tsal86] 

Prufstand,  [Snee78] 

architectural  overview  of  a  distributed  testbed,  [Garc83] 
for  Ada  see  also  TEAM, 

ATVS:  Ada  Test  and  Verification  System,  [RADC86] 
knowledge-based,

for  kernel  system  calls  of  UNIX  systems,  [Pesc85] 
program  testing  assistant,  [Chap82] 
review  of,  [Rama75a] 

Testing  Strategies,  [Dunn84] 
for  expert  systems,  [Hite88] 
for  large,  complex  real-time  systems,  [Ginz65] 
grey  box  testing,  [Prob80],  [Prob82a] 
partition  testing,  [Haml88] 
comparison  with  other  strategies,  [Haml88] 

Theory  of  Programming,  [Dahl72],  [Davi83a],  [Grie81] 

Dijkstra’s  calculus,  [Grie76] 
a  discipline,  [Dijk76a] 
axiomatic  basis  for,  [Hoar69],  [Hoar71a] 
computability  and  unsolvability,  [Davi82a] 
computing  as  a  physical  science,  [Good88] 
convergence,  correctness  and  equivalence, 
of  functional  programs,  [Mann70] 
equivalence  problem  for  loop-free  programs,  [Ibar82]
function  semantics  for  sequential  programs,  [Mill80b] 
mathematical  theory  of  computation,  [Mann74] 
model  of  large  program  development,  [Bela76] 
nondeterminism,  [Kenn80] 
notions  of  correctness, 

existential/universal  partial/total  correctness,  [Gall81] 
relationship  between, 

mathematical  proof,  algebraic  languages,  transcendental  numbers,  proof  by  sampling,  [Davi77]

Theory  of  Testing,  [Howd78d],  [Prat83]

NP-completeness,  [Gare78] 
applications  of, 

linking  theory  with  practice,  [Mill77a] 
test  planning,  [Moha79] 

concurrent  systems,  [Weis87],  [Weis88a],  [Weis88c] 
extension  of  sequential  methods/theory,  [Weis88a] 
error  propagation  and  elimination,  [More81] 
error-based  testing,  [More84] 
fault-based  testing,  [More87],  [More88] 
investigative  approaches, 

(dis)advantages  to  theoretical/empirical,  [Howd78a] 

Popperian,  [Cher87a] 
abstract,  [Boug85b],  [Cher88] 
as  equivalence  problem,  [Howd78b] 
general  model  for  static  analysis,  [Howd83] 
inductive  inference,  [Cher86],  [Cher87b] 
mathematical  framework,  [Gour81],  [Gour83] 
modeling  the  testing  process,  [Down86] 
to  study  testing/debugging  effectiveness,  [Down85a] 
uniform/nonuniform  execution  models,  [Down86] 
models  of  correct  programs  and  testing,  [Howd74b] 

Tools, 

JAVS:  Jovial  Automated  Verification  System,  [RADC76b] 

NODAL,  [Mait80] 

PACE:  Product  Assurance  Confidence  Evaluator, 
programmer’s  guide,  [Hoff73] 
analysis, 

supported  by  data  management  system,  [John77] 
classification  of,  [Reif79b] 
practical  applications  of,  [Brow72b] 
test  data  generation,  [Bast78],  [Chen75],  [Holt76] 

ATDG:  Automated  Test  Data  Generator  System,  [Hoff75],  [Hoff76] 
for  recursive  programs  having  simple  errors,  [Broo80c] 
supported  by  Prolog,  [Gerh85] 
testing  of,  [Henn79] 

Trace  Analysis, 

for  distributed  systems,  [Jard87] 
communication  protocols,  [Boch88] 
for  conformance  and  arbitration  testing,  [Boch87a] 

Transition  Testing,  [Beiz83] 

Tutorials, 
models  and  metrics  for  management/engineering,  [Basi80a]

reliability,  [Hech80] 

testing  and  validation,  [Mill81a] 

validation  and  verification,  [Yeh77] 

User  Interface  Models, 

Chiron  for  software  environments,  [Youn88b] 

VDM:  Vienna  Software  Development  Method,  [Bjor78],  [Bjor82],  [Bjor87] 
example  in  analysis  phase,  [Bloo86]
with  Prolog,

for  animation  of  programs,  [Bloo86]

for  back-to-back  testing  of  diverse  software,  [Bloo86]

VHSIC, 

analysis  of,  [Luck86b] 

semantics  of  timing  constructs,  [Luck86c] 

Walkthroughs,  see  Code  Reading  and  Inspections,  [Myer78a] 

Weak  Mutation  Analysis,  [Howd82a] 
automated  support  for,  see  also  FORTEST 
error  detection  ability,  [Girg86a] 

Wide-Spectrum  Languages,
applications  of, 

transformational  programming,  [Baue89] 
basis  for  a  software  development  environment,  [Luck86a] 
for  program  specification  and  development,  [Baue79b] 

Zipf’s  Law,

applications  of, 

estimating  size  and  effort,  [Moha79]

m-EVES,  [Crai88b],  [Pase87a]
comparison  with  other  techniques,  [Crai88a] 
example  of  low  water  mark  problem,  [Crai87b] 
m-NEVER  theorem  prover,  [Crai88a],  [Pase87b] 
m-Verdi,  [Crai87c],  [Crai87d],  [Crai88a]
4.  ABSTRACTS

[AFOT86]  Abbreviated  Introduction:  This  pamphlet  is  a  guide  for  the  Air  Force  Operational  Test  and 
Evaluation  Center  (AFOTEC)  Software  Evaluation  Manager  (SEM)  and  Deputy  for  Software  Evaluation 
(DSE).  It  describes  the  numerous  activities  associated  with  planning,  conducting,  analyzing,  and  reporting 
software  operational  test  and  evaluation  (OT&E)  assessments. 

[AFOT87]  Abbreviated  Introduction:  The  purpose  of  this  document  is  to  provide  the  software  evaluator  the 
information  needed  to  conduct  the  Air  Force  Operational  Test  and  Evaluation  Center’s  (AFOTEC’s)  software 
maintainability  evaluation  process.  In  this  document  software  maintainability  is  limited  in  scope  to  software 
design  and  documentation  assessments. 

[AFOT88a]  Abbreviated  Introduction:  This  document  describes  the  method  and  procedures  used  by  the  Air 
Force  Operational  Test  and  Evaluation  Center  (AFOTEC)  for  evaluating  the  software  support  resources  (SSR) 
for  mission  critical  computer  resources  (MCCR)  supportability. 

[AFOT88b]  Abbreviated  Introduction:  The  purpose  of  this  pamphlet  is  to  provide  the  software  evaluation 
manager  and  the  deputy  for  software  evaluation  information  needed  to  evaluate  mission  critical  computer 
software  life  cycle  processes  as  they  influence  software  supportability.  In  this  pamphlet  are  the  means  to  track  the 
processes  affecting  mission  critical  computer  software  supportability,  beginning  as  early  as  necessary  to  provide 
insight  into  the  quality  of  the  evolving  software  products,  software  support  resources,  and  operational  support 
life  cycle  procedures  themselves. 

[AFSC86a]  Abbreviated  Introduction:  This  pamphlet  describes  management  indicators  that  will  provide  visibility
into  the  acquisition  of  mission-critical  computer  resources.  It  is  intended  to  help  program  managers  by
presenting  software  management  indicators  that  reflect  the  status  of  software  development  in  an  acquisition  program.
It  also  provides  information  that  reflects  experience  on  previous  acquisition  projects.  Indicators  are  just
that:  indicators.  They  do  not,  nor  are  they  intended  to,  replace  sound  management  practices  and  communications.
Indicators,  properly  applied,  thoroughly  understood,  and  meticulously  followed-up,  will  lead  the  contractor
and  program  office  to  those  areas  requiring  management  attention.

[AFSC86b]  Abbreviated  Introduction:  This  pamphlet  describes  indicators  that  will  provide  insight  into  the  quality
of  mission-critical  computer  resources.  It  is  intended  to  help  program  managers  by  presenting  indicators  that
reflect  the  quality  of  the  software  products  developed  in  an  acquisition  program.  It  also  provides  information  that
reflects  experience  gained  on  previous  acquisition  programs.  Indicators  are  just  that:  indicators.  They  do  not,
nor  are  intended  to,  replace  sound  quality  practices.  These  indicators,  properly  applied  and  meticulously  followed-up,
will  lead  the  contractor  and  program  office  to  those  areas  requiring  additional  quality  attention.

[AFSC88a]  Overview:  The  purpose  of  this  pamphlet  is  to  help  Program  Directors  (PD)  develop  an  IV&V  program
that  meets  their  system’s  specific  requirements.  The  pamphlet  describes  a  six  step  procedure  for  determining
the  need  for  a  software  IV&V  effort,  establishing  its  scope,  identifying  tasks  and  subtasks  associated  with
each  IV&V  requirement,  selecting  a  qualified  contractor,  and  estimating  software  IV&V  costs.  In  addition,  this
pamphlet  integrates  the  software  engineering  tasks  of  DOD-STD-2167A  with  the  software  IV&V  tasks  to  ensure
value  is  added  to  the  software  development  process  and  product.  The  methods  used  in  this  pamphlet  are  based
on  a  MIL-STD-882  (System  Safety  Program  Requirements)  approach  as  well  as  a  composite  of  similar  initiatives
from  Space  Division  (SD),  Aeronautical  Systems  Division  (ASD),  and  Electronic  Systems  Division  (ESD).

[AFSC88b]  Abbreviated  Overview:  This  pamphlet  describes  software  risk  abatement  processes  composed  of
risk  identification,  analysis,  and  handling  techniques  that  can  significantly  contribute  to  improving  the  acquisition
of  mission-critical  computer  resources.  It  is  intended  to  help  program  directors  by  integrating  software  risk
abatement  with  system-level  risk  handling  techniques.  Risk  abatement  techniques  can  help  the  contractor  and
program  office  to  improve  the  performance  and  support  of  the  software  in  weapon  systems.

[Abde86]  Abstract:  Different  software  reliability  models  can  produce  very  different  answers  when  called  upon  to
predict  future  reliability  in  a  reliability  growth  context.  Users  need  to  know  which,  if  any,  of  the  competing  predictions
are  trustworthy.  Some  techniques  are  presented  which  form  the  basis  of  a  partial  solution  to  this  problem.
Rather  than  attempting  to  decide  which  model  is  generally  best,  the  approach  adopted  here  allows  a  user  to
decide  upon  the  most  appropriate  model  for  each  application.

[Acre80]  Abbreviated  Introduction:  Program  testing  has  been  practiced  as  long  as  has  programming  itself,  in 
spite  of  the  general  confession  that  testing  can  never  prove  in  any  absolute  sense  that  a  program  is  correct.  Two 
facts  are  responsible  for  the  popularity  of  testing.  The  first  is  that  testing  has  a  tendency  to  uncover  program 
errors,  and  that  the  more  systematic  the  testing,  the  stronger  this  tendency.  The  second  is  that  a  program  that  is 
not  completely  correct  is  not  necessarily  unreliable  in  a  given  operating  environment,  and  that  even  a  program 
that  is  not  completely  reliable  will  usually  not  be  completely  worthless  to  its  users.  Those  responsible  for 
software  system  development  are  charged  with  deciding  how  much  they  are  willing  to  pay  for  a  given  increase  in 
reliability. The challenge for research is therefore to produce a testing method that is (1) more effective at uncovering
errors and (2) less expensive to apply. Mutation analysis has been put forward as such a method. Working
mutation  systems  have  demonstrated  that  mutation  analysis  can  be  performed  at  an  attractive  cost  on  realistic 
programs.  In  this  work,  the  effectiveness  of  the  method  is  studied  by  experiments  with: 

1.  System  requirements  definition 

2.  System  functional  specifications 

3.  Software  requirements  definition 

4.  Software  functional  specifications 

5.  Software  implementation 

The  mutation  analysis  methodology  examined  in  this  work  has  as  its  goal  validation  of  the  last  stage, 
software  implementation.  As  such  it  overlaps  some  proposed  validation  methods,  and  complements  others.  The 
following  sections  outline  some  of  these  techniques. 

[Adam80] Abstract: An effort to automate the debugging of real programs is presented. We discuss possible
choices  in  conceiving  a  debugging  system.  In  order  to  detect  all  the  semantic  errors,  it  must  have  knowledge  of 
what  the  program  is  intended  to  achieve.  Strategies  and  results  are  very  dependent  on  the  way  of  giving  this 
knowledge.  In  the  LAURA  system  that  we  have  designed,  the  program’s  task  is  given  by  means  of  a  “program 
model.”  Automatic  debugging  is  then  viewed  as  a  comparison  of  programs.  The  main  characteristics  of  LAURA 
are  the  representation  of  programs  by  graphs,  which  gets  rid  of  many  syntactical  variations,  the  use  of  program 
transformations,  realized  on  the  graphs,  and  its  heuristic  strategy  to  identify  step  by  step  elements  of  the  graphs. 
It  has  been  tested  with  about  a  hundred  programs  written  by  students  to  solve  eight  different  problems  in  various 
fields.  It  is  able  to  recognize  correct  programs  even  if  their  structures  are  very  different  from  the  structure  of  the 
program  model.  It  is  also  able  to  express  exact  diagnostics  of  errors,  or  at  least  to  localize  them.  It  could  be  an 
effective  tool  for  student  programmers. 

[Adri82]  Abstract:  Software  quality  is  achieved  through  the  application  of  development  techniques  and  the  use 
of verification procedures throughout the development process. Careful consideration of specific quality attributes
and validation requirements leads to the selection of a balanced collection of review, analysis, and testing
techniques  for  use  throughout  the  life  cycle.  This  paper  surveys  current  validation,  verification,  and  testing 
approaches  and  discusses  their  strengths,  weaknesses,  and  life  cycle  usage.  In  conjunction  with  these,  the  paper 
describes  automated  tools  used  to  implement  validation,  verification,  and  testing.  In  the  discussion  of  new 
research  thrusts,  emphasis  is  given  to  the  continued  need  to  develop  a  stronger  theoretical  basis  for  testing  and 
the  need  to  employ  combinations  of  tools  and  techniques  that  may  vary  over  each  application. 

[Albe76] Abstract: This paper presents an examination into the economics of software quality assurance. An
analysis of the software life cycle is performed to determine where in the cycle the application of quality
assurance  techniques  would  be  most  beneficial.  The  number  and  types  of  errors  occurring  at  various  phases  of 
the software life cycle are estimated. A variety of approaches to increasing software quality (including Structured
Programming, Top Down Design, Programmer Management Techniques and Automated Tools) are reviewed
and their potential impact on quality and costs is examined.

[Albr83] Abstract: One of the most important problems faced by software developers and users is the prediction
of  the  size  of  a  programming  system  and  its  development  effort.  As  an  alternative  to  “size,”  one  might  deal  with 
a  measure  of  the  “function”  that  the  software  is  to  perform.  Albrecht  has  developed  a  methodology  to  estimate 
the  amount  of  the  “function”  the  software  is  to  perform,  in  terms  of  the  data  it  is  to  use  (absorb)  and  to  generate 
(produce).  The  “function”  is  quantified  as  “function  points,”  essentially,  a  weighted  sum  of  the  number  of 
“inputs,”  “outputs,”  “master  files,”  and  “inquiries”  provided  to,  or  generated  by,  the  software.  This  paper 
demonstrates  the  equivalence  between  Albrecht’s  external  input/output  data  flow  representative  of  a  program 
(the  “function  points”  metric)  and  Halstead’s  “software  science”  or  “software  linguistics”  model  of  a  program 
as  well  as  the  “soft  content”  variation  of  Halstead’s  model  suggested  by  Gaffney. 

Further, the degrees of correlation between “function points” and the eventual “SLOC” (source lines of
code) of the program, and between “function points” and the work-effort required to develop the code, are
demonstrated. The “function point” measure is thought to be more useful than “SLOC” as a predictor of work
effort because “function points” are relatively easily estimated from a statement of basic requirements for a program
early in the development cycle.

The  strong  degree  of  equivalency  between  “function  points”  and  “SLOC”  shown  in  the  paper  suggests  a 
two-step  work-effort  validation  procedure,  first  using  “function  points”  to  estimate  “SLOC,”  and  then  using 
“SLOC”  to  estimate  the  work-effort.  This  approach  would  provide  validation  of  application  development  work 
plans  and  work-effort  estimates  early  in  the  development  cycle.  The  approach  would  also  more  effectively  use  the 
existing base of knowledge on producing “SLOC” until a similar base is developed for “function points.”

The paper assumes that the reader is familiar with the fundamental theory of “software science” measurements
and the practice of validating estimates of work-effort to design and implement software applications (programs).
If not, a review of [cited references] is suggested.
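
As an illustration of the weighted sum and the two-step estimate described in this abstract, the following Python sketch may help. It is not taken from the cited paper: the weights, the SLOC-per-function-point ratio, and the hours-per-SLOC ratio are all placeholder assumptions chosen only to make the arithmetic concrete.

```python
# Placeholder weights for the four counted item types (an assumption for
# illustration; the abstract does not state the weights).
WEIGHTS = {"inputs": 4, "outputs": 5, "master_files": 10, "inquiries": 4}


def function_points(counts):
    """Return the weighted sum of the counted inputs, outputs, files, and inquiries."""
    return sum(weight * counts.get(kind, 0) for kind, weight in WEIGHTS.items())


def estimate_sloc(fp, sloc_per_fp):
    """Step 1: estimate source lines of code from function points (linear assumption)."""
    return fp * sloc_per_fp


def estimate_effort(sloc, hours_per_sloc):
    """Step 2: estimate work-effort from the estimated SLOC (linear assumption)."""
    return sloc * hours_per_sloc


# Hypothetical counts taken from a statement of basic requirements.
counts = {"inputs": 10, "outputs": 8, "master_files": 3, "inquiries": 5}
fp = function_points(counts)                 # 40 + 40 + 30 + 20 = 130
sloc = estimate_sloc(fp, sloc_per_fp=100)    # hypothetical ratio
hours = estimate_effort(sloc, hours_per_sloc=0.5)  # hypothetical ratio
```

In practice the two ratios would be fitted to an organization's own project history rather than fixed in advance.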

[Alle74] Abstract: The data relationships which exist between the procedures in a program are of interest in program
analysis and optimization. In this paper an analysis algorithm is given which determines the interprocedural
data  flow  relationships  which  exist  in  a  collection  of  procedures.  The  context  of  the  analysis  is  a  static  (compile 
time)  analysis  of  procedures  within  a  high  level  language.  It  assumes  that  the  collection  obeys  certain  constraints, 
the  most  serious  of  which  is  that  the  procedures  cannot  be  recursive.  While  many  practical  considerations  are 
not  addressed  in  this  paper,  the  basic  practical  constraint  that  each  procedure  in  the  collection  be  analyzed  only 
once  is  satisfied.  Existing  results  in  intraprocedural  data  flow  analysis  form  the  basis  for  the  algorithm. 

[Alle76]  Abstract:  The  global  data  relationships  in  a  program  can  be  exposed  and  codified  by  the  static  analysis 
methods  described  in  this  paper.  A  procedure  is  given  which  determines  all  the  definitions  which  can  possibly 
“reach”  each  node  of  the  control  flow  graph  of  the  program  and  all  the  definitions  that  are  “live”  on  each  edge  of 
the graph. The procedure uses an “interval” ordered edge listing data structure and handles reducible and irreducible
graphs indistinguishably.
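
The notion of definitions that "reach" a node can be made concrete with a short Python sketch. Note that it uses a simple iterative worklist formulation, not the interval-ordered edge listing procedure of the cited paper; the graph, the definition names `d1` and `d2`, and the function name are illustrative assumptions.

```python
def reaching_definitions(succ, gen, kill):
    """Compute, for each node, the set of definitions that can reach its entry.

    succ maps each node to its successor list; gen and kill map each node to
    the sets of definitions it creates and overwrites.
    """
    pred = {n: set() for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    reach_in = {n: set() for n in succ}
    reach_out = {n: set(gen[n]) for n in succ}
    work = list(succ)
    while work:
        n = work.pop()
        # A definition reaches n if it leaves any predecessor of n.
        reach_in[n] = set().union(*(reach_out[p] for p in pred[n]))
        new_out = gen[n] | (reach_in[n] - kill[n])
        if new_out != reach_out[n]:
            reach_out[n] = new_out
            work.extend(succ[n])  # successors must be revisited
    return reach_in


# Hypothetical flow graph: A defines x (d1), C redefines x (d2) inside a loop.
succ = {"A": ["B"], "B": ["C", "D"], "C": ["B"], "D": []}
gen = {"A": {"d1"}, "B": set(), "C": {"d2"}, "D": set()}
kill = {"A": {"d2"}, "B": set(), "C": {"d1"}, "D": set()}
reach = reaching_definitions(succ, gen, kill)
```

Here both definitions of `x` reach the loop head `B` and the exit `D`, because the loop body may or may not have executed.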

[Ambi76a]  Abstract:  An  introduction  to  the  Gypsy  programming  and  specification  language  is  given.  Gypsy  is  a 
high-level  programming  language  with  facilities  for  general  programming  and  also  for  systems  programming  that 
is oriented toward communications processing. This includes facilities for concurrent processes and process synchronization.
Gypsy also contains facilities for detecting and processing errors that are due to the actual running
of  the  program  in  an  imperfect  environment.  The  specification  facilities  give  a  precise  way  of  expressing  the 
desired  properties  of  the  Gypsy  programs.  All  of  the  features  of  Gypsy  are  fully  verifiable,  either  by  formal  proof 
or by validation at run time. An overview of the language design and a detailed example program are given.

[Amor75] Abstract: This report presents preliminary results of a study in the area of error classification. A general
method of error classification is described which is designed to serve as a guideline for experiment-specific
applications.  A  survey  of  error  classification  and  analysis  work,  both  in  the  general  literature  and  at  MITRE,  as 
well  as  a  study  of  error  experiment  design  considerations,  are  reflected  in  the  discussion  and  conclusions. 

[Amst76]  Abstract:  The  purpose  of  the  experiment  was  to  see  if  it  was  possible  to  provide  a  useful  tool  to  aid  in 
the  improvement  and  automatic  measurement  of  the  quality  of  computer  programs.  Such  a  tool  was  wanted 
because  of  the  large  number  (over  1000)  of  programs  involved,  and  because  of  a  desire  to  obtain  quantitative 
measures  of  quality.  There  were  five  subjective  quality  definitions  considered.  The  first  two  dealt  with  the  extent 
to  which  reduction  in  object  code  could  be  made  via  simple  transformations  or  a  complete  restructuring  of  the 
program.  The  next  two  consisted  of  the  extent  to  which  reductions  in  the  number  of  source  statements  could  be 
made  via  simple  transformations  or  a  complete  restructuring  of  the  program.  The  last  was  the  ranked  clarity  of 
the  program  source.  The  principal  method  used  was  the  manual  grading  of  a  sample,  extraction  of  quantifiable 
independent  variables,  and  the  use  of  regression  analysis  to  derive  prediction  formulas.  Results  included  some 
tentative  quality  prediction  formulas,  correlations  among  the  independent  and  dependent  variables  (e.g., 
GOTO’s  and  clarity),  observations  about  programming,  and  a  host  of  newly-generated  questions.  There  are 
indications  that  for  two  large  program  populations,  we  have  derived  a  useful  tool  for  automatically  differentiating 
between  good  and  bad  programs. 

[Ande76a]  Abstract:  The  need  for  reliable  complex  systems  motivates  the  development  of  techniques  by  which 
acceptable  service  can  be  maintained,  even  in  the  presence  of  residual  errors.  Recovery  blocks  allow  a  software 
designer to include tests on the acceptability of the various phases of a system’s operation, and to specify alternative
actions should the acceptance tests fail. This approach relies on certain architectural features, ideally implemented
in hardware, by which control and data structures can be retrieved after errors.

A brief account is presented of the recovery block scheme, together with a description of a new implementation
of the underlying cache mechanism. The salient features of a proposed computer architecture are
described,  which  incorporates  this  implementation  and  also  provides  a  high  level  detection  for  errors  such  as  the 
corruption  of  code  and  data.  A  prototype  system  has  been  constructed  to  test  the  viability  of  these  techniques  by 
executing  programs  containing  recovery  blocks  on  an  emulator  for  the  proposed  architecture.  Experiences  in 
running this system are recounted with respect to the execution of programs based on erroneous algorithms and
also  with  respect  to  errors  introduced  by  deliberate  attempts  to  corrupt  the  system. 
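
The recovery block idea summarized in this abstract (an acceptance test, alternative actions, and restoration of state after an error) can be sketched in software as follows. This is a minimal Python illustration under stated assumptions: state is restored by copying rather than by the hardware cache mechanism the paper describes, and all names are hypothetical.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Try each alternate on a fresh copy of the checkpointed state.

    Returns the first resulting state that passes the acceptance test.
    The caller's state is never modified, which stands in for state restoration.
    """
    checkpoint = copy.deepcopy(state)
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)  # restore before each attempt
        try:
            alternate(candidate)
        except Exception:
            continue  # a raised error counts as a failed attempt
        if acceptance_test(candidate):
            return candidate
    raise RuntimeError("all alternates failed the acceptance test")


def faulty_sort(state):
    # Primary alternate with a seeded error: it drops the smallest item.
    state["items"] = sorted(state["items"])[1:]


def backup_sort(state):
    state["items"] = sorted(state["items"])


def accept(state):
    items = state["items"]
    in_order = all(a <= b for a, b in zip(items, items[1:]))
    return in_order and len(items) == state["expected_len"]


state = {"items": [3, 1, 2], "expected_len": 3}
result = recovery_block(state, [faulty_sort, backup_sort], accept)
```

The primary alternate fails the acceptance test, so the block falls back to the second alternate, starting again from the checkpointed state.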

[Ande76b]  Summary:  SEMANOL  is  a  practical  programming  system  for  writing  readable  formal  specifications 
of  the  syntax  and  semantics  of  programming  languages.  SEMANOL  is  based  on  a  theory  of  semantics  which 
embraces algorithmic (operational) and extensional (input/output) semantics. Specifications for large contemporary
languages have been constructed in the formal language, SEMANOL (73), which is a readable high-level
notation.  A  SEMANOL  (73)  specification  can  be  executed  (by  an  existing  interpreter  program);  when  given  a 
program  from  the  specified  language,  and  its  input,  the  execution  of  the  SEMANOL  (73)  provides  important 
practical advantages. This paper includes discussions of the theory of semantics underlying SEMANOL, the syntax
and semantics of the SEMANOL (73) language, the use of the SEMANOL (73) language in the SEMANOL
method  for  describing  programming  languages,  and  the  contrast  between  the  Vienna  definition  method  (VDL) 
and  SEMANOL. 

[Ande79a]  Table  of  Contents:  Mathematical  Induction.  Proving  the  correctness  of  flowchart  programs,  basic 
principles of proving flowchart programs correct, the inductive assertion method, and formalizing inductive
assertion  proofs.  Proving  the  correctness  of  programs  written  in  a  standard  programming  language,  examples  for 
Fortran  and  PL/I.  Proving  the  correctness  of  recursive  programs  by  using  structural  induction.  Current  research 
related  to  proving  program  correctness.  References. 

[Ande83] Abstract: Real-time systems often have very high reliability requirements and are therefore prime candidates
for the inclusion of fault tolerance techniques. In order to provide tolerance to software faults, some
form of state restoration is usually advocated as a means of recovery. State restoration can be expensive and the
cost  is  exacerbated  for  systems  which  utilize  concurrent  processes.  The  concurrency  present  in  most  real-time 
systems  and  the  further  difficulties  introduced  by  timing  constraints  suggest  that  providing  tolerance  for  software 
faults  may  be  inordinately  expensive  or  complex.  We  believe  that  this  need  not  be  the  case,  and  propose  a 
straightforward pragmatic approach to software fault tolerance which is believed to be applicable to many real-time
systems. The approach takes advantage of the structure of real-time systems to simplify error recovery, and
a classification scheme for errors is introduced. Responses to each type of error are proposed which allow service
to be maintained.

[Ande85]  Abstract:  In  order  to  assess  the  effectiveness  of  software  fault-tolerance  techniques  for  enhancing  the 
reliability  of  practical  systems,  a  major  experimental  project  has  been  conducted  at  the  University  of  Newcastle 
upon  Tyne.  Techniques  were  developed  for,  and  applied  to,  a  realistic  implementation  of  a  real-time  system  (a 
naval command and control system). Reliability data were collected by operating this system in a simulated tactical
environment for a variety of action scenarios. This paper provides an overview of the project and presents the
results of three phases of experimentation. An analysis of these results shows that use of the software fault tolerance
approach yielded a substantial improvement in the reliability of the command and control system.

[Ande88] Abstract: Analysis of the WIS Ada source code involved applying an automated, hierarchical, Ada-specific
software metrics framework to approximately 200,000 lines of Air Force-supplied Ada source. The purpose
of the analysis was to aid the Air Force in identification of the characteristics of the code that detract
unnecessarily from reliability, maintainability, and portability. The software was analyzed during the initial phase
of code development to ensure that sufficient time would be allotted for the elimination of undesired characteristics.

DRC’s  Ada  metrics  framework  measures  three  software  factors,  six  software  criteria,  and  150  software 
metric  elements,  where  each  metric  element  relates  a  software  quality  principle  to  the  use  of  specific  features  of 
the  Ada  language. 

The  analysis  of  the  Air  Force-supplied  Ada  source  involved: 

1.  automated  calculation  of  metric  scores  for  the  supplied  source, 

2.  human  analysis  of  the  metric  scores  to  determine  those  characteristics  that  augment  or  attenuate  quality  and  to 
formulate  recommendations  on  how  to  enhance  quality, 

3. modification of two modules of the supplied source to illustrate the impact of [the authors’] recommendations,
and 

4.  reporting  of  the  findings  to  the  Air  Force. 

[Andr81]  Abstract:  This  paper  describes  an  automated  testing  methodology  and  an  experiment  performed  to 
determine its effectiveness. The method is to insert in the program to be tested a number of “executable assertions,”
statements about the program that trigger error signals whenever they are evaluated to be false (violated).
A testcase is then developed for the program using actual values of the input variables. When the program is run,
a plot is generated of the number of assertions violated versus the input variable values used. The resulting function
is called the “error function.” Heuristic search algorithms can then be used to maximize this function and
thereby automatically locate input values which cause the most errors to occur. The experiment included
developing assertions for the program to be tested, choosing and inserting representative errors into the program,
and implementing search and data collection algorithms for testing. The results indicate that combining
executable assertions with heuristic search algorithms is an effective method for automating the testing of computer
programs.
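
The combination of executable assertions, an "error function," and heuristic search can be sketched in Python as below. This is an illustration only, not the experiment of the cited paper: the program under test, its seeded error, the three assertions, and the simple multi-step hill climb are all assumptions made for the example.

```python
def check(condition, violations, label):
    """Executable assertion: record a violation instead of halting."""
    if not condition:
        violations.append(label)


def program_under_test(x, violations):
    """Hypothetical routine meant to return abs(x) clamped to 100."""
    y = abs(x)
    if y > 50:
        y = y * 3  # seeded error: the intended behavior is a clamp to 100
    check(y >= 0, violations, "result is non-negative")
    check(y <= 100, violations, "result within clamp")
    check(y <= 200, violations, "result within gross bound")
    return y


def error_function(x):
    """Number of assertions violated when the program runs on input x."""
    violations = []
    program_under_test(x, violations)
    return len(violations)


def heuristic_search(start, lo, hi, steps=(1, 10, 100)):
    """Hill climb over integer inputs, trying several step sizes per round."""
    best, best_score = start, error_function(start)
    improved = True
    while improved:
        improved = False
        for step in steps:
            for cand in (best - step, best + step):
                if lo <= cand <= hi and error_function(cand) > best_score:
                    best, best_score = cand, error_function(cand)
                    improved = True
    return best, best_score


worst_input, violated = heuristic_search(start=0, lo=-1000, hi=1000)
```

Starting from an input that violates no assertions, the search moves to an input that violates two, which points the tester at the faulty branch.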

[Angl83]  Abstract:  There  has  been  a  great  deal  of  theoretical  and  experimental  work  in  computer  science  on 
inductive  inference  systems,  that  is  systems  that  try  to  infer  general  rules  from  examples.  However,  a  complete 
and  applicable  theory  of  such  systems  is  still  a  distant  goal.  This  survey  highlights  and  explains  the  main  ideas 
that  have  been  developed  in  the  study  of  inductive  inference,  with  special  emphasis  on  the  relations  between  the 
general theory and the specific algorithms and implementation.

[Angu80] Abstract: The purpose of this study was the validation of existing software (S/W) reliability models.
This validation was accomplished by investigating the properties of model parameter estimates, by investigating
the validity of model internal assumptions, and by analyzing the goodness-of-fit of the models. These investigations
were all made in terms of actual S/W error data for sixteen (16) electronic system computer programs
which  represented  a  wide  variety  of  system  types. 

The  types  of  S/W  reliability  models  studied  were  basically  two:  Poisson  and  Binomial  models.  The 
methods  of  parameter  estimation  investigated  were  also  two:  the  maximum  likelihood  method  and  the  least 
squares  method. 

[Angu83]  Abstract:  The  objective  of  this  study  was  to  demonstrate  the  use  and  applicability  to  Air  Force 
software acquisition managers of six quantitative software reliability models to a major command, control, communications,
and intelligence (C3I) system. The scope of the effort involved the collection of software error data
from  an  ongoing  C3I  project,  fitting  the  six  models  to  the  data  thus  collected,  analysis  of  the  predictions  provided 
by  the  models,  and  the  development  of  conclusions,  recommendations,  and  guidelines  for  software  acquisition 
managers  pertaining  to  the  use  and  applicability  of  the  six  software  reliability  models. 

[Appe88]  Abstract:  Mutation  analysis  is  a  method  for  software  testing  in  which  many  slightly  differing  versions 
of  a  program  are  executed  on  the  same  test  data;  the  end  result  is  a  measure  of  the  data’s  quality.  Over  the  last  ten 
years, several mutation-based systems have demonstrated the usefulness of mutation analysis for software testing.
This paper shows how mutation analysis can be a useful tool for testing large scale Ada software systems. We
first sketch the general theory of mutation analysis, then show how to apply it to Ada. We then discuss some of
the significant new problems that Ada poses for mutation analysis. Finally, we describe a prototype multilanguage
mutation system that allows the testing of both Fortran 77 and Ada programs.
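
The core idea of mutation analysis, running slightly differing versions of a program on the same test data to measure the data's quality, can be sketched in a few lines of Python. The sketch is illustrative only: the mutants are written by hand rather than generated by a mutation system, and the program and names are hypothetical.

```python
def original(a, b):
    """Program under test: return the larger of two numbers."""
    return a if a > b else b


# Hand-written mutants, each differing slightly from the original.
MUTANTS = {
    "flip comparison": lambda a, b: a if a < b else b,
    "always first": lambda a, b: a,
    "always second": lambda a, b: b,
}


def mutation_score(program, mutants, test_data):
    """Return (killed, total): a mutant is killed if any test case
    makes its output differ from the original program's output."""
    killed = [
        name
        for name, mutant in mutants.items()
        if any(mutant(*case) != program(*case) for case in test_data)
    ]
    return len(killed), len(mutants)


weak_data = [(1, 2)]
strong_data = [(1, 2), (3, 1)]
weak_result = mutation_score(original, MUTANTS, weak_data)
strong_result = mutation_score(original, MUTANTS, strong_data)
```

The single-case test set leaves the "always second" mutant alive, which signals that the data never exercises the branch where the first argument is larger; adding the second case kills it.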

[Apt80]  Abstract:  An  axiomatic  proof  system  is  presented  for  proving  partial  correctness  and  absence  of 
deadlock (and failure) of communicating sequential processes. The key (meta) rule introduces cooperation
between proofs, a new concept needed to deal with proofs about synchronization by message passing. CSP’s
new convention for distributed termination of loops is dealt with. Applications of the method involve correctness
proofs for two algorithms, one for distributed partitioning of sets, the other for distributed computation of
the  greatest  common  divisor  of  n  numbers. 

[Apt81] Abstract: A survey of various results concerning Hoare’s approach to proving partial and total correctness
of programs is presented. Emphasis is placed on the soundness and completeness issues. Various proof systems
for while programs, recursive procedures, local variable declarations, and procedures with parameters,
together  with  the  corresponding  soundness,  completeness,  and  incompleteness  results,  are  discussed. 

[Apt83a]  Abstract:  In  a  previous  paper  a  proof  system  dealing  with  partial  correctness  of  communicating 
sequential  processes  was  introduced.  Soundness  and  relative  completeness  of  this  system  are  proved  here.  It  is 
also  indicated  in  what  way  the  semantics  and  the  proof  system  can  be  extended  to  deal  with  the  total  correctness 
of  the  programs. 

[Ardo88]  Abstract:  IDA  Paper  P-2036  presents  a  simple  architecture  specification  in  the  SDI  Architecture 
Dataflow  Modeling  Technique  (SADMT).  The  example  code  is  given  in  the  SADMT  Generator  (SAGEN) 
Language. This simple architecture includes (1) an informal description of the architecture, (2) the main program
that creates the components of the simulation, (3) the specification of the BM/C3 logical processes of the
architecture,  (4)  the  specification  of  the  Technology  Modules  (TMs)  of  the  architecture,  and  (5)  the  specification 
of  the  BM/C3  and  the  TMs  of  the  threat. 

[Army84] Preface: This Software Quality Engineering Handbook was developed by the U.S. Army Computer
Systems  Command,  Quality  Assurance  Directorate.  It  describes  techniques  for  establishing  quality  goals  for  a 
software project, applying those goals during software development, and evaluating fulfillment of those goals.

[Army87] Overview: The Software Test and Evaluation Manual is a three volume reference set that provides
checklists  and  guidance  to  Department  of  Defense  components  in  the  area  of  software  test  and  evaluation  for 
major  systems  through  improved  acquisition  management  and  risk  reduction  procedures.  This  manual  addresses 
the  structuring,  planning,  conduct,  and  evaluation  of  software  tests  throughout  the  acquisition  process.  Volume 
II is intended for use by the Service Headquarters, Development Commands, Program Offices and Contractors,
Development  Test  Agencies,  and  Operational  Test  Agencies. 

[Artb88] Abbreviated Introduction: Software maintenance is a complex and costly activity. Such activities significantly
outweigh developmental costs and are estimated to consume more than half of the total life cycle cost of
a  system.  Factors  contributing  to  this  substantial  burden  include: 

•  the  demand  for  and  shortage  of  maintenance  personnel  who  possess  the  necessary  skills  demanded  by 
maintenance  activities, 

• the lack of complementary methods or techniques for performing maintenance activities, and

• the scarcity of tools for supporting activities intrinsic to the maintenance of complex software systems.

In  an  effort  to  control  the  complexity  and  costs  associated  with  maintenance  activities,  a  group  of  software 
engineers at the Naval Surface Warfare Center (NSWC) in Dahlgren, Virginia has developed the Automated
Design  Description  System  (ADDS)  that  supports  maintenance  activities  through  the  use  of  reverse  engineering 
techniques.  This  report  examines  ADDS  relative  to  current  technologies,  and  discusses  the  strengths  and 
weaknesses  of  the  system  as  a  tool  for  supporting  software  maintenance. 

The  Automated  Design  Description  System  employs  reverse  engineering  concepts  to  produce  specialized 
documents  tailored  for  the  maintenance  activity.  Many  studies  tout  the  need  for  and  benefits  of  documentation 
during  software  maintenance.  Other  authors  have  enunciated  the  principle  of  concurrent  documentation  that 
reflects  current  system  design  and  development  status.  Although  recognized  as  a  necessity,  the  maintenance  of  a 
consistently  high  level  of  documentation  throughout  the  software  life  cycle  is  rarely  achieved.  Consequently, 
software  engineers  have  sought  methods  and  tools  to  compensate  for  human  inadequacies  in  documentation. 
Slowly,  reverse  engineering  concepts  have  emerged,  as  have  tools  such  as  ADDS. 

[Auer86] Abstract: RT-ASLAN is a formal language for specifying real-time systems. It is an extension of the
ASLAN  specification  language  for  sequential  systems.  Some  of  the  features  of  the  ASLAN  language,  such  as 
constructs  for  writing  procedural  semantics  in  a  nonprocedural  logical  language,  are  highlighted.  The 
RT-ASLAN language supports specification of parallel real-time processes through arbitrary levels of abstraction;
processes do not have to be specified to the same level of detail. Communicating processes use an interface
process as an abstract data type representing shared information. From RT-ASLAN specifications, performance
correctness conjectures are generated. These conjectures are logic statements whose proof guarantees
the  specification  meets  critical  time  bounds.  A  detailed  example  as  well  as  a  discussion  of  the  advantages  and 
disadvantages  of  formal  specification  and  verification  are  included. 

[Aviz75]  Abstract:  Two  complementary  methods  which  are  employed  in  order  to  assure  reliable  computing  are 
fault-intolerance  and  fault-tolerance.  Fault-intolerance  depends  on  the  elimination  of  the  causes  of  unreliability 
prior to the start of the computing process while fault-tolerance employs protective redundancy during the computing
process in order to detect and to correct unreliable functioning. A balanced allocation of reliability
resources between the two methods appears to offer the best practical solution. The paper reviews current fault-tolerance
practices in system architecture and discusses their relevance to software systems.

[Aviz77]  Abstract:  N-version  programming  results  in  N  independently  generated,  but  functionally  equivalent 
programs  which  are  intended  to  provide  fault-tolerance  for  software  faults  during  program  execution.  A  pilot 
experiment  in  N-version  programming  is  described  and  an  evolving  methodology  for  this  form  of  programming  is 
outlined. 
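
The way N functionally equivalent versions can mask a software fault during execution is often illustrated with majority voting over their results. The Python sketch below is such an illustration under stated assumptions; the voting rule, the three integer square root versions, and all names are hypothetical and are not drawn from the cited pilot experiment.

```python
import math
from collections import Counter


def majority_vote(versions, *args):
    """Run every version on the same input and return the majority result.

    A version that raises an exception casts no vote. A result is accepted
    only if more than half of all versions agree on it.
    """
    results = []
    for version in versions:
        try:
            results.append(version(*args))
        except Exception:
            continue
    if results:
        value, votes = Counter(results).most_common(1)[0]
        if 2 * votes > len(versions):
            return value
    raise RuntimeError("no majority among versions")


def isqrt_library(n):
    return math.isqrt(n)


def isqrt_counting(n):
    # Independently written version: count up until the square exceeds n.
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_faulty(n):
    return n // 2  # seeded design fault


versions = [isqrt_library, isqrt_counting, isqrt_faulty]
answer = majority_vote(versions, 16)
```

With three versions, one faulty result is outvoted; if the versions disagree pairwise, the vote fails and the caller must handle the error.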

[Aviz84]  Abbreviated  Introduction:  Fault  tolerance  is  the  survival  attribute  of  computer  architectures;  when  a 
system is able to recover automatically from fault-caused errors, and eliminate faults without suffering an
externally perceivable failure, the system is said to be fault tolerant. Originally, fault-tolerant architectures were
developed to tolerate physical faults that occur because of random failure phenomena in the hardware of a system.
More recently, the tolerance of design faults, especially in software, has been added to the objectives of
fault  tolerance. 

Design diversity is the approach in which the hardware and software elements that are to be used for multiple
computations are not copies, but are independently designed to meet a system’s requirements. Different
designers and design tools are employed in each effort, and commonalities are systematically avoided. The obvious
advantage of design diversity is that reliable computing does not require the complete absence of design
faults, but only that those faults not produce similar errors in a majority of the designs. This and other advantages,
as well as some limitations of design diversity, are discussed in this article.

[Aviz85]  Abstract:  Evolution  of  the  N-version  software  approach  to  the  tolerance  of  design  faults  is  reviewed. 
Principal requirements for the implementation of N-version software are summarized and the DEDIX distributed
supervisor and testbed for the execution of N-version software is described. Goals of current research are
presented  and  some  potential  benefits  of  the  N-version  approach  are  identified. 

[Aviz87]  Overview:  The  Advanced  Automation  System,  or  AAS,  will  provide  automation  services  to  both 
en-route and terminal air traffic controllers throughout the United States. Although controllers are able to maintain
separation between aircraft during periods of interruption of the automatic services, the transition to backup
modes of operation is potentially hazardous. The increased controller workload resulting from interruption of
the services provided to controllers limits the traffic handling capability of the Air Traffic Control system, which
can result in major delays during periods of heavy traffic. As the level of automation services provided to controllers
increases, interruption of computer services to the controllers will become even more critical. Accordingly,
extremely high reliability and availability of the services provided by the AAS will be required on a continuous
basis, 24 hours a day, seven days a week.

In this article, we will discuss only the main element of the AAS, which is the area control computer
complex, or ACCC. The approach used to define the ACCC requirements illustrates the approach used for the
other  computer  complexes  as  well. 

[Avru85]  Abstract:  In  this  paper  we  outline  an  approach  to  describing  and  analyzing  designs  for  distributed 
software  systems.  A  descriptive  notation  is  introduced,  and  analysis  techniques  applicable  to  designs  expressed 
in  that  notation  are  presented.  The  usefulness  of  the  approach  is  illustrated  by  applying  it  to  a  realistic  distributed 
software-system  design  problem  involving  mutual  exclusion  in  a  computer  network. 

[Avru86] Abstract: We describe an approach to the design of concurrent software systems based on the constrained
expression formalism. This formalism provides a rigorous conceptual model for the semantics of concurrent
computations, thereby supporting analysis of important system properties as part of the design process.
At the same time, [the authors’] approach allows designers to use standard specification and design languages,
rather than forcing them to deal with the formal model explicitly or directly. As a result, [the authors’] approach
attains the benefits of formal rigor without the associated pain of unnatural concepts or notations for its users.

The conceptual model of concurrency underlying the constrained expression formalism treats the collection
of possible behaviors of a concurrent system as a set of sequences of events. The constrained expression formalism
provides a useful closed-form description of these sequences. We have developed algorithms for translating
designs expressed in a wide variety of notations into these constrained expressions. We have also developed a
number  of  powerful  analysis  tools  that  can  be  applied  to  these  descriptions. 

In  this  paper,  we  describe  the  constrained  expression  formalism  and  these  analysis  techniques.  We 
describe  the  way  this  approach  would  be  used  in  design,  giving  an  example  illustrating  its  use  in  conjunction  with 
an  Ada-like  design  language,  and  discuss  present  and  future  prospects  for  its  automation  and  use. 

[Bagg80] Abstract: In the past, most software tests were constructed by heuristics and by drawing upon experience
with similar software. Recently, enough preliminary work has been done to propose an analytical
construction of test cases.

This report begins by defining five broad classes of software tests: Type 0, Type 1, Type 2, Type 3 and
Type 4. In a Type 0 test, all instructions are exercised at least once. In Type 1 and Type 2 tests, all flowchart paths are
exercised at least once. Type 1 is performed by forced traversal and Type 2 by natural execution. Types 3 and 4
are infeasible and only a strategy lying between Type 1 and 2 can effectively be implemented.

Since enumeration of all the paths in a given program is required for Type 1 and 2 tests, this report establishes
the lower and upper bounds on the number of paths as a function of the number of deciders, describes a
manual decomposition procedure to cut a graph into smaller subgraphs, and proposes an algorithm to machine-identify
all paths. A complete Type 1.5 driver system for forced path traversal, implemented in PL/1, is then
thoroughly  described,  together  with  suggestions  on  how  to  extend  these  techniques  to  other  languages. 

A  typical  program  is  analyzed  manually,  tested  with  data  and  run  through  the  system.  Some  evaluation  of 
the  usefulness  of  the  system  is  eventually  given  in  the  light  of  the  accumulated  experience. 
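
The path enumeration that Type 1 and Type 2 tests depend on can be sketched as a depth-first walk over a loop-free flowgraph. This is a generic illustration, not the report's algorithm or its PL/1 driver; the graph encoding and the two example graphs (`sequential` and `nested`, each with two deciders) are assumptions made for the example.

```python
def enumerate_paths(graph, entry, exit_node):
    """List every entry-to-exit path in an acyclic flowgraph.

    graph maps each node to the list of its successors. Loops are not
    handled: a cyclic graph would make this walk recurse without end.
    """
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        if node == exit_node:
            paths.append(prefix)
            return
        for successor in graph.get(node, []):
            walk(successor, prefix)

    walk(entry, [])
    return paths

# Two deciders in sequence: each decision doubles the number of paths.
sequential = {
    "d1": ["a", "b"], "a": ["d2"], "b": ["d2"],
    "d2": ["c", "d"], "c": ["end"], "d": ["end"],
}

# Two nested deciders: the second decision is reached on one branch only.
nested = {
    "d1": ["d2", "end"],
    "d2": ["a", "b"], "a": ["end"], "b": ["end"],
}

if __name__ == "__main__":
    for name, graph in (("sequential", sequential), ("nested", nested)):
        found = enumerate_paths(graph, "d1", "end")
        print(name, len(found), found)
```

With the same number of deciders, the two layouts give different path counts, which is why the number of paths has to be bounded rather than computed from the decider count alone.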

[Baia84] Abstract: The structure of a compiler for the ECSP language is described. ECSP is a concurrent
language extending Hoare’s CSP: it supports dynamic communication channels and nested processes. The compilation
of ECSP programs is obtained by the composition of several tools of minimal functionalities.

A set of static checks on interactions between concurrent processes is described. The checks verify the
mutual consistency of the interfaces of processes: an interface is given by a set of input/output channels connecting
a process to its partners. It is shown that the amount and the coverage of the checks depend on the entities
referred to in interprocess communication constructs and that both increase with the adoption of explicit naming.

The checks on process interfaces are carried out in several tools of the compiler front-end to achieve
machine independence. To support separate compilation, each tool can be applied to a subset of the processes
of  the  program. 

[Baia85]  Abstract:  This  work  discusses  some  issues  in  the  debugging  of  concurrent  programs.  A  set  of  desirable 
characteristics  of  a  debugger  for  concurrent  languages  is  deduced  from  an  examination  of  the  differences 
between  the  debugging  of  concurrent  programs  and  that  of  sequential  ones.  A  debugger  for  a  concurrent 
language,  derived  from  CSP,  is  then  presented.  It  is  based  upon  a  semantic  model  of  the  supported  language. 
The  debugger  enables  [us]  to  compare  a  description  of  the  program  behavior  to  the  actual  behavior  as  well  as  to 
evaluate  assertions  on  the  process  state.  The  description  of  the  behavior  is  given  by  a  formalism  whose  semantics 
is also specified. The formalism can specify program behaviors at various abstraction levels. Lastly, some guidelines
for the implementation of the debugger are shown and a detailed example of program description is
analyzed. 

[Bail80] Abstract: One of the basic goals of software engineering is the establishment of useful models and equations
to predict the cost of any given programming project. Many models have been proposed over the last
several  years,  but,  because  of  differences  in  the  data  collected,  types  of  projects  and  environmental  factors 
among  software  development  sites,  these  models  are  not  transportable  and  are  only  valid  within  the  organization 
where  they  were  developed.  This  result  seems  reasonable  when  one  considers  that  a  model  developed  at  a  certain 
environment  will  only  be  able  to  capture  the  impact  of  the  factors  which  have  a  variable  effect  within  that 
environment.  Those  factors  which  are  constant  at  that  environment,  and  therefore  do  not  cause  variations  in  the 
productivity among projects produced there, may have a different or variable effect at another environment.

This  paper  presents  a  model-generation  process  which  permits  the  development  of  a  resource  estimation 
model  for  any  particular  organization.  The  model  is  based  on  data  collected  by  that  organization  which  captures 
its  particular  environmental  factors  and  the  differences  among  its  particular  projects.  The  process  provides  the 
capability  of  producing  a  model  tailored  to  the  organization  which  can  be  expected  to  be  more  effective  than  any 
model  originally  developed  for  another  environment.  It  is  demonstrated  here  using  data  collected  from  the 
Software  Engineering  Laboratory  at  the  NASA/Goddard  Space  Flight  Center. 

[Bail81] Abstract: This paper describes an application of Maurice Halstead’s software theory to a real-time
switching system. The Halstead metrics and the software tool developed for computing them are discussed.
Analysis  of  the  metric  data  indicates  that  the  level  of  the  switching  language  was  not  constant  across  algorithms 
and  that  software  error  data  was  not  a  linear  function  of  volume. 
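
For orientation, the basic Halstead quantities can be computed from operator and operand token lists as sketched below. The formulas are the commonly quoted software science definitions; the function name `halstead`, the hand-made token lists, and the use of the reciprocal of difficulty as the level estimate are assumptions for this example and do not describe the tool in the cited paper.

```python
import math

def halstead(operators, operands):
    """Compute basic Halstead measures from token lists.

    operators and operands hold every occurrence, in any order.
    Both lists must be non-empty.
    """
    n1, n2 = len(set(operators)), len(set(operands))   # distinct tokens
    N1, N2 = len(operators), len(operands)             # total occurrences
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    level = 1 / difficulty                 # estimated program level
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": difficulty * volume,
        "language_level": level ** 2 * volume,
    }

if __name__ == "__main__":
    # Hand-tokenized form of:  a = b + c ; a = a + b   (separators ignored)
    ops = ["=", "+", "=", "+"]
    opnds = ["a", "b", "c", "a", "a", "b"]
    for name, value in halstead(ops, opnds).items():
        print(f"{name}: {value:.3f}")
```

How tokens are classified as operators or operands is language-specific and strongly affects the resulting numbers, so values from different counting tools are not directly comparable.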

[Bake72a]  Introduction:  Experience  in  development  and  maintenance  of  large  computer-based  systems  for 
government and industry has led the IBM Federal Systems Division to the formulation of a new approach to production
programming. This approach, which couples a new kind of programming organization (a Chief Programmer
Team) with formal tools for using structured programming in system development, was recently applied on a
contract with The New York Times for an online information system. Compared to experience on similar contracts
in the past, the approach resulted in increased programmer productivity coupled with improved quality.
An  earlier  paper  describes  the  approach  in  detail  and  gives  productivity  measures  in  a  form  which  should  allow 
comparability  to  other  systems.  Following  a  brief  description  of  the  system  and  a  review  of  the  approach,  this 
paper  discusses  the  quality  of  the  system  as  observed  during  a  thorough  acceptance  test  and  in  the  initial  period 
of  operation  following  its  delivery. 

[Bake72b]  Seeking  to  demonstrate  increased  programmer  productivity,  a  functional  organization  of  specialists 
led  by  a  chief  programmer  has  combined  and  applied  known  techniques  into  a  unified  methodology. 

Combined are a program production library, general-to-detail implementation, and structured programming.
The overall methodology has been applied to an information storage and retrieval system.

Experimental results suggest significantly increased productivity and decreased system integration difficulties.

[Bake79b]  Abstract:  An  investigation  is  made  into  the  extent  to  which  relationships  from  software  science  are 
useful  in  analyzing  programming  methodology  principles  that  are  concerned  with  modularity.  Using  previously 
published  data  from  over  500  programs,  it  is  shown  that  the  software  science  effort  measure  provides  quantitative 
answers to questions concerning the conditions under which modularization is beneficial. Among the issues discussed
are the reduction of similar code sequences by temporary variable and subprogram definition, and the use
of global variables. Using data flow analysis, environmental considerations which affect the applicability of alternative
modularity techniques are also discussed.

The  results  obtained  using  software  science  are  compared  with  certain  generally  accepted  methodologies 
involving modularity, and show strong agreement. Finally, the results suggest some areas of potential improvement
in the technique used to obtain the software science measurements.

[Bake80] Abstract: In attempting to describe the quality of computer software, one of the more frequently mentioned
measurable attributes is complexity of the flow of control. During the past several years, there have been
many  attempts  to  quantify  this  aspect  of  computer  programs,  approaching  the  problem  from  such  diverse  points 
of  view  as  graph  theory  and  software  science.  Most  notable  measures  in  these  areas  are  McCabe’s  cyclomatic 
complexity  and  Halstead’s  software  effort.  More  recently,  Woodward  et  al.,  have  proposed  a  complexity  measure 
based  on  the  number  of  crossings,  or  “knots,”  of  arcs  in  a  linearization  of  the  flowgraph. 

Focusing on these three quantities, we establish their major properties as measures of control flow complexity.
Particular attention is directed at the behavior of the measures in structured programming environments,
including  the  effect  of  various  structuring  transformations  on  the  measures. 

As a result of this investigation, weaknesses of each of the measures taken individually are exposed. However,
the software effort and cyclomatic complexity measures appear to have disjoint areas of weakness. This
suggests  that  more  comprehensive  measures  of  control  flow  complexity  can  be  motivated  by  consideration  of 
combinations  of  these  basic  measures. 
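
Of the three measures named in this entry, cyclomatic complexity is the simplest to compute. The sketch below applies McCabe's formula V(G) = E - N + 2P to a flowgraph given as an edge list; the function name and the small example graphs are assumptions for illustration and are not drawn from the cited paper.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's V(G) = E - N + 2P for a flowgraph given as (source, target) pairs.

    components is P, the number of connected components (1 for a single routine).
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components

# Hypothetical flowgraphs used only as examples.
if_else = [("d", "a"), ("d", "b"), ("a", "end"), ("b", "end")]
while_loop = [("entry", "test"), ("test", "body"), ("body", "test"), ("test", "exit")]
two_ifs = [
    ("d1", "a"), ("d1", "b"), ("a", "d2"), ("b", "d2"),
    ("d2", "c"), ("d2", "d"), ("c", "end"), ("d", "end"),
]

if __name__ == "__main__":
    for name, edges in (("if_else", if_else), ("while_loop", while_loop), ("two_ifs", two_ifs)):
        print(name, cyclomatic_complexity(edges))
```

In these examples the result is the number of binary decisions plus one, regardless of how the decisions are arranged.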

[Bake88]  Abstract:  The  reliability  of  a  program,  when  many  copies  are  run  in  a  multisite  environment  with  the 
support  of  a  software  service  organization,  depends  upon  the  inherent  reliability  of  the  program  and  certain 
characteristics  of  the  service  organization.  In  this  paper  we  identify  a  small  number  of  parameters  that  determine 
the relevant characteristics of the service organization, and analyze their effects upon the reliability of the
program as it is experienced at an average site. This is done with two software reliability models, a first discovery
model  and  a  total  (defect)  discovery  model. 

[Balz69]  With  the  advent  of  the  higher-level  algebraic  languages,  the  computer  industry  expected  to  be  relieved 
of  the  detailed  programming  required  at  the  assembly-language  level.  This  expectation  has  largely  been  realized. 
Many  systems  are  now  being  built  in  higher-level  languages  (most  notably  MULTICS). 

However,  our  ability  to  debug  programs  has  not  advanced  much  with  our  increased  use  of  the  higher-level 
languages. 

We  have,  in  general,  merely  copied  the  on-line  assembly-language  debugging  aids,  rather  than  design 
totally  new  facilities  for  higher-level  languages.  We  have  neither  created  new  graphical  formats  in  which  to 
present the debugging information, nor provided a reasonable means by which users can specify the processing
required  on  any  available  debugging  data. 

EXDAMS (Extendable Debugging And Monitoring System) is an attempt to break this impasse by providing
a single environment in which the users can easily add new on-line debugging aids to the system
one-at-a-time  without  further  modifying  the  source-level  compilers,  EXDAMS,  or  their  programs  to  be 
debugged. 

[Barr82]  Abstract:  An  axiomatic  proof  system  is  developed  for  use  in  proving  partial  correctness  and  absence  of 
deadlock in Ada tasks. Axioms for the Ada tasking primitives in isolation are presented, and then rules are proposed
that describe the logical interaction of tasks through the rendezvous mechanism. These axioms and rules
are then used to present partial correctness proofs of parallel-processing examples written in Ada. The system is
extended to deal with questions of blocking and detection of deadlock and, finally, the problems of task termination
and exception handling are discussed.

[Barr84] Abstract: A compositional temporal logic proof system for the specification and verification of concurrent
programs is presented. Versions of the system are developed for shared-variable and communication-based
programming languages that include procedures.

[Barr85] Abstract: The book summarizes and reviews nine verification techniques for parallel programs, four of
which are based on shared variables (viz., the methods proposed by Flon and Suzuki, Jones, Lamport, Owicki
and Gries) while the other five are based on message passing (Apt, Francez and de Roever, Barringer and Mearns,
Levin and Gries, Misra and Chandy, Zhou and Hoare).

The  following  scheme  has  been  assumed  for  the  presentation  of  each  method: 

1.  a  list  of  major  references  to  the  work; 

2.  an  overview  of  the  method; 

3. a summary of the examples the method was applied to in the major references;

4.  a  detailed  exposition  of  the  application  of  the  approach  to  some  different  examples; 

5.  general  comments  on  the  method; 

6.  a  summary  of  the  proof  system. 

Mainly the set partition problem and the bubble-lattice sort program are used as test problems in the worked
examples. 

For  newcomers,  the  book  may  perhaps  be  of  some  value  as  a  first  overview  and  guide  to  the  literature.  For 
experts, a much more detailed assessment and comparison of the methods reviewed would certainly have been
desirable (the conclusions section which compares the problems and benefits of the methods reviewed is only
one and a half pages long!).

[Bart78]  Abstract:  A  new  interprocedural  data  flow  analysis  algorithm  is  presented  and  analyzed.  The  algorithm 
associates  with  each  procedure  in  a  program  information  about  which  variables  may  be  modified,  which  may  be 
used, and which are possibly preserved by a call on the procedure, and all of its subcalls. The algorithm is sufficiently
powerful to be used on recursive programs and to deal with the sharing of variables which arises through
reference parameters. The algorithm is unique in that it can compute all of this information in a single pass, not
requiring a prepass to compute calling relationships or sharing patterns. The algorithm is asymptotically optimal
in  time  complexity.  It  has  been  implemented  and  is  practical  even  on  programs  which  are  quite  large. 
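
The kind of summary information described in this entry can be illustrated with a much simpler technique than the paper's: a plain iterative fixpoint over the call graph. The sketch below is not the single-pass algorithm of the cited paper and ignores sharing through reference parameters; the function name, the data layout, and the example program are all assumptions made for illustration.

```python
def summarize_side_effects(direct, calls):
    """Propagate per-procedure variable sets through a call graph.

    direct maps a procedure to the variables it modifies itself.
    calls maps a procedure to the procedures it calls.
    Returns, for each procedure, the variables that may be modified by a
    call on it, including all of its subcalls. Recursion is handled
    because iteration continues until no summary grows.
    """
    procedures = set(direct) | set(calls)
    for callees in calls.values():
        procedures.update(callees)
    summary = {p: set(direct.get(p, ())) for p in procedures}

    changed = True
    while changed:
        changed = False
        for caller in procedures:
            for callee in calls.get(caller, ()):
                missing = summary[callee] - summary[caller]
                if missing:
                    summary[caller] |= missing
                    changed = True
    return summary

if __name__ == "__main__":
    # Hypothetical program: main calls f; f and g are mutually recursive.
    direct_mods = {"main": {"x"}, "f": {"y"}, "g": {"z"}}
    call_graph = {"main": ["f"], "f": ["g"], "g": ["f"]}
    for proc, mods in sorted(summarize_side_effects(direct_mods, call_graph).items()):
        print(proc, sorted(mods))
```

The same propagation applies unchanged to "may be used" sets; only the meaning of the input sets differs.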

[Basi75] Abstract: This paper recommends the “iterative enhancement” technique as a practical means of using
a top-down, stepwise refinement approach to software development. This technique begins with a simple initial
implementation of a properly chosen (skeletal) subproject which is followed by the gradual enhancement of successive
implementations in order to build the full implementation. The development and quantitative analysis of
a production compiler for the language SIMPL-T is used to demonstrate that the application of iterative
enhancement to software development is practical and efficient, encourages the generation of an easily modifiable
product, and facilitates reliability.

[Basi78a]  Abstract:  The  collection  and  analysis  of  data  from  programming  projects  is  necessary  for  the 
appropriate  evaluation  of  software  engineering  methodologies.  Towards  this  end,  the  Software  Engineering 
Laboratory  was  organized  between  the  University  of  Maryland  and  NASA  Goddard  Space  Flight  Center.  This 
paper  describes  the  structure  of  the  Laboratory  and  provides  some  data  on  project  evaluation  from  some  of  the 
early  projects  that  have  been  monitored.  The  analysis  relates  to  resource  forecasting  using  a  model  of  the  project 
life  cycle  based  upon  the  Rayleigh  equation  and  to  error  rates  applying  ideas  developed  by  Belady  and  Lehman. 
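
A life-cycle model based on the Rayleigh equation is usually written with a staffing rate m(t) = 2Kat exp(-at^2) and cumulative effort C(t) = K(1 - exp(-at^2)), where K is total effort and a is a shape parameter. The sketch below evaluates that common form; the function names and the parameter values are assumptions for illustration and are not data from the Software Engineering Laboratory.

```python
import math

def rayleigh_staffing(t, total_effort, a):
    """Staffing rate at time t: m(t) = 2 * K * a * t * exp(-a * t**2)."""
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)

def rayleigh_cumulative(t, total_effort, a):
    """Effort spent up to time t: C(t) = K * (1 - exp(-a * t**2))."""
    return total_effort * (1.0 - math.exp(-a * t * t))

def shape_from_peak(peak_time):
    """Shape parameter a for which staffing peaks at peak_time: a = 1 / (2 * t_peak**2)."""
    return 1.0 / (2.0 * peak_time ** 2)

if __name__ == "__main__":
    K = 100.0                   # total effort in staff-months (made-up value)
    a = shape_from_peak(10.0)   # staffing peaks at month 10 (made-up value)
    for month in (0, 5, 10, 15, 20, 30):
        print(month,
              round(rayleigh_staffing(month, K, a), 2),
              round(rayleigh_cumulative(month, K, a), 1))
```

Setting the derivative of m(t) to zero gives the peak at t = 1/sqrt(2a), at which point C(t) = K(1 - exp(-1/2)), or roughly 39 percent of the total effort.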

[Basi79a]  Abstract:  There  is  a  need  for  a  distinguishing  set  of  useful  automatable  measures  of  the  software 
development  process  and  product.  Measures  are  considered  useful  if  they  are  sensitive  to  externally  observable 
differences  in  development  environments  and  their  relative  values  correspond  to  some  intuition  regarding  these 
characteristic  differences.  Such  measures  could  provide  an  objective  quantitative  foundation  for  constructing 
quality assurance standards and for calibrating mathematical models of software reliability and resource estimation.
This paper presents a set of automatable measures that were implemented, evaluated in a controlled experiment,
and found to satisfy these usefulness criteria. The measures include computer job steps, program
exchanges,  program  size,  and  cyclomatic  complexity. 

[Basi79b] Abstract: The effects of human factors on “high-level” software properties (too intangible to quantify
directly) can be inferred from the collective behavior of related “low-level” aspects.

[Basi80b]  Abstract:  A  family  of  structural  complexity  metrics  which  contains  a  number  of  current  metrics  is 
developed.  The  family  may  be  used  to  give  a  framework  for  experimental  analysis  of  metrics.  By  implementing 
the family or a suitable subfamily as an automatic metric tool, many metrics become readily available and may even
be merged to form new metrics in response to information obtained during exploratory analysis.

[Basi81a]  Abstract:  This  paper  presents  an  attempt  to  examine  a  set  of  basic  relationships  among  various 
software development variables, such as size, effort, project duration, staff size, and productivity. These variables
are plotted against each other for 15 Software Engineering Laboratory projects that were developed for
NASA/Goddard  Space  Flight  Center  by  Computer  Sciences  Corp.  Certain  relationships  are  derived  in  the  form 
of  equations,  and  these  equations  are  compared  with  a  set  derived  by  Walston  and  Felix  for  IBM  Federal  Systems 
Division  project  data.  Although  the  equations  do  not  have  the  same  coefficients,  they  are  seen  to  have  similar 
exponents. In fact, the Software Engineering Laboratory equations tend to be within one standard error of estimate
of the IBM equations.
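
Equations with a coefficient and an exponent, as compared in this entry, are typically of the form effort = a * size^b and can be derived by least squares on logarithms. The sketch below shows that fitting step under this assumption; the function name is hypothetical, and the data points are synthetic, generated from arbitrary values of a and b, not taken from either organization's projects.

```python
import math

def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares in log-log space.

    Returns (a, b). All sizes and efforts must be positive.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

if __name__ == "__main__":
    # Synthetic points on an exact power law (a = 5.0, b = 0.9 chosen arbitrarily).
    sizes = [10.0, 20.0, 50.0, 100.0]
    efforts = [5.0 * s ** 0.9 for s in sizes]
    a, b = fit_power_law(sizes, efforts)
    print(f"effort = {a:.3f} * size^{b:.3f}")
```

Because the coefficient a absorbs units and local conventions while the exponent b describes how effort scales with size, two environments can differ in a yet agree closely in b.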

[Basi81b]  Abstract:  We  describe  in  this  paper  an  effective  data  collection  method  for  evaluating  software 
development  methodologies,  from  definition  of  the  objectives  of  the  data  collection  to  analysis  of  the  results.  We 
show  how  the  data  analysis  can  answer  questions  with  respect  to  how  successfully  the  goals  of  the  development 
methodology are met. The A-7 requirements document is used as an example. We provide the results of data analyses
conducted partway through the A-7 flight software development cycle, and discuss the utility of information
obtained by such partial analyses. Results from the study show that data collection is feasible and useful when
performed as part of configuration control, that data distributions based on partial data provide useful feedback
to the developers, and that the A-7 Requirements Document is easily maintained and changed.

[Basi81c] Abstract: A software engineering research study has been undertaken to empirically analyze and compare
various software development approaches; its fundamental features and initial findings are presented in this
paper.  An  experiment  was  designed  and  conducted  to  confirm  certain  suppositions  concerning  the  beneficial 
effects  of  a  particular  disciplined  methodology  for  software  development.  The  disciplined  methodology  consisted 
of  programming  teams  employing  certain  techniques  and  organizations  commonly  defined  under  the  umbrella 
term  structured  programming.  Other  programming  teams  and  individual  programmers  both  served  as  control 
groups  for  comparison.  The  experimentally  tested  hypotheses  involved  a  number  of  quantitative,  objective, 
unobtrusive,  and  automatable  measures  of  programming  aspects  dealing  with  the  software  development  process 
and  the  developed  software  product.  The  experiment’s  results  revealed  several  programming  aspects  for  which 
statistically  significant  differences  existed  between  the  disciplined  methodology  and  the  control  groups.  The 
results  were  interpreted  as  confirmation  of  the  original  suppositions  and  evidence  in  favor  of  the  disciplined 
methodology.  This  paper  describes  the  specific  features  of  the  experiment;  outlines  the  investigative  approach 
used  to  plan,  execute,  and  analyze  it;  reports  its  immediate  results;  and  interprets  them  according  to  intuitions 
regarding  the  disciplined  methodology. 

[Basi81f]  Abstract:  This  paper  analyzes  the  resource  utilization  curve  developed  by  Parr.  The  curve  is  compared 
with  several  other  curves,  including  the  Rayleigh  curve,  a  parabola,  and  a  trapezoid,  with  respect  to  how  well 
they fit manpower utilization. The evaluation is performed for several projects of the 6-12 man-year variety developed
in the Software Engineering Laboratory. The conclusion drawn is that the Parr curve can be made to
fit  the  data  better  than  the  other  curves.  However,  because  of  the  noise  in  the  data,  it  is  difficult  to  confirm  the 
shape  of  the  manpower  distribution  from  the  data  alone  and  therefore  difficult  to  validate  any  particular  model. 
Also,  since  the  parameters  used  in  the  curve  are  not  easily  calculable  or  estimable  from  known  data,  the  curve  is 
not  effective  for  resource  estimation. 

[Basi81g] Abbreviated Introduction: Among the most popular metrics have been the software science metrics
of Halstead, and the cyclomatic complexity metric of McCabe. One question is whether these metrics actually
measure such things as effort and complexity. One measure of effort may be the time required to produce a product.
One measure of complexity might be the number of errors made during the development of a product. A
second question is how these metrics compare with standard size measures, such as the number of source lines
or the number of executable statements, i.e., do they do a better job of predicting the effort or the number of
errors? Lastly, how do these metrics relate to each other?

One  simple  way  of  checking  the  relationship  between  errors  or  effort  and  the  various  metrics  is  to  examine 
the  plots  of  variables  against  one  another  and  correlations  between  the  various  variables.  This  provides  us  with  a 
first  look  at  attempting  to  shed  some  light  on  the  questions  posed  and  the  relationships  that  may  hold. 

[Basi82a] Abstract: An effective data collection methodology for evaluating software development methodologies
was applied to four different software development projects. Goals of the data collection included characterizing
changes and errors, characterizing projects and programmers, identifying effective error detection and
correction techniques, and investigating ripple effects.

The data collected consisted of changes (including error corrections) made to the software after code was
written and baselined, but before testing began. Data collection and validation were concurrent with software
development. Changes reported were verified by interviews with programmers. Analysis of the data showed patterns
that were used in satisfying the goals of the data collection. Some of the results are summarized in the following:

1. Error corrections aside, the most frequent type of change was an unplanned design modification.

2. The most common type of error was one made in the design or implementation of a single component of the
system. Incorrect requirements and misunderstandings of functional specifications, interfaces, support software
and hardware, and languages and compilers were generally not significant sources of errors.

3. Despite a significant number of requirements changes imposed on some projects, there was no corresponding
increase in frequency of requirements misunderstandings.

4. More than 75% of all changes took a day or less to make.

5. Changes tended to be nonlocalized with respect to individual components but localized with respect to the
subsystems.

6. Relatively few changes resulted in errors. Relatively few errors required more than one attempt at correction.

7. Most errors were detected by executing the program. The cause of most errors was found by reading code.
Support facilities and techniques such as traces, dumps, cross-reference and attribute listings, and program
proving were rarely used.

[Basi82b] Overview: In this newsletter, we briefly describe [the authors’] approach to data collection followed by
a  description  of  the  software  development  project  that  we  are  monitoring.  We  then  address  several  central  issues 
related  to  the  use  of  Ada  in  the  design  phase  of  this  project. 

[Basi82c]  Abstract:  An  effective  data  collection  method  for  evaluating  software  development  methodologies 
and  for  studying  the  software  development  process  is  described.  The  method  uses  goal-directed  data  collection 
to  evaluate  methodologies  with  respect  to  the  claims  made  for  them.  Such  claims  are  used  as  a  basis  for  defining 
the goals of the data collection, establishing a list of questions of interest to be answered by data analysis, defining
a set of data categorization schemes, and designing a data collection form.

The  data  to  be  collected  are  based  on  the  changes  made  to  the  software  during  development,  and  are 
obtained  when  the  changes  are  made.  To  ensure  accuracy  of  the  data,  validation  is  performed  concurrently  with 
software  development  and  data  collection.  Validation  is  based  on  interviews  with  those  people  supplying  the 
data.  Results  from  using  the  methodology  show  that  data  validation  is  a  necessary  part  of  change  data  collection. 
Without  it,  as  much  as  50  percent  of  the  data  may  be  erroneous. 

Feasibility  of  the  data  collection  methodology  was  demonstrated  by  applying  it  to  five  different  projects  in 
two  different  environments.  The  application  showed  that  the  methodology  was  both  feasible  and  useful. 

[Basi82d]  Introduction:  The  identification  of  the  various  factors  that  have  an  effect  on  software  development  is 
of  prime  concern  to  software  engineers.  The  specific  focus  of  this  paper  is  to  analyze  the  relationships  between 
the  frequency  and  distribution  of  errors  during  software  development,  the  maintenance  of  the  developed 
software,  and  a  variety  of  environmental  factors.  These  factors  include  the  complexity  of  the  software,  the 
developer’s  experience  with  the  application,  and  the  reuse  of  existing  design  and  code.  Such  relationships  can 
provide  an  insight  into  the  characteristics  of  computer  software  and  the  effects  that  an  environment  can  have  on 
the  software  product.  Such  relationships  can  also  improve  the  reliability  and  quality  with  respect  to  computer 
software.  In  an  effort  to  acquire  knowledge  of  these  basic  relationships,  change  data  for  a  medium-scale  software 
project  were  analyzed.  (Change  data  include  any  documentation  that  reports  an  alteration  made  to  the  software 
for  a  particular  reason.) 

The  overall  objectives  of  this  paper  are  threefold:  first,  to  report  the  results  of  the  analyses;  second,  to 
review  the  results  in  the  context  of  those  reported  by  other  researchers;  third,  to  draw  some  conclusions  based 
on  the  first  two  objectives.  The  analyses  presented  in  this  paper  encompass  various  types  of  distributions  based 
on  the  collected  change  data.  The  most  important  are  the  error  distributions  observed  within  the  software  pro¬ 
ject. 

[Basi83a]  Abstract:  The  emergence  of  Ada  provides  the  opportunity  and  necessity  for  measurement,  analysis, 
and  experimentation  in  software  development.  Over  the  past  several  months,  we  have  been  studying  a  software 
project developed in Ada. One of the goals of the study is to identify metrics which are useful for evaluating and
predicting  the  complexity,  quality,  and  cost  of  Ada  programs.  This  paper  defines  a  set  of  metrics  for  use  with 
software  development  in  Ada.  The  metrics  are  gathered  into  six  categories:  effort,  changes,  dimension,  language 
use,  data  use,  and  execution.  They  are  described  further  using  formula  generators,  distributions,  and  formulas. 
Examples of each metric, as well as specific uses, are also included. Finally, [the authors'] continuing research in
this  area  is  described. 

[Basi83b]  Abstract:  The  desire  to  predict  the  effort  in  developing  or  explain  the  quality  of  software  has  led  to  the 
proposal  of  several  metrics  in  the  literature.  As  a  step  toward  validating  these  metrics,  the  Software  Engineering 
Laboratory has analyzed the Software Science metrics, cyclomatic complexity, and various standard program
measures for their relation to 1) effort (including design through acceptance testing), 2) development errors
(both discrete and weighted according to the amount of time to locate and fix), and 3) one another. The data investigated
are collected from a production Fortran environment and examined across several projects at once, within
individual projects and by individual programmers across projects, with three effort-reporting accuracy checks
demonstrating  the  need  to  validate  the  database.  When  the  data  come  from  individual  programmers  or  certain 
validated  projects,  the  metrics’  correlations  with  actual  effort  seem  to  be  the  strongest.  For  modules  developed 
entirely  by  individual  programmers,  the  validity  ratios  induce  a  statistically  significant  ordering  of  several  of  the 
metrics’  correlations.  When  comparing  the  strongest  correlations,  neither  Software  Science’s  E  metric, 
cyclomatic  complexity  nor  source  lines  of  code  appears  to  relate  convincingly  better  with  effort  than  the  others. 
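
As a reminder of what one of the metrics named above measures, the following is a minimal sketch of McCabe's cyclomatic number, V(G) = E - N + 2P, computed from a control-flow graph. The graph, the node names, and the function name are hypothetical illustrations and are not taken from the SEL study or its tooling.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's V(G) = E - N + 2P for a control-flow graph given as edge pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Hypothetical routine: one IF/ELSE followed by one loop.
EDGES = [
    ("entry", "if"), ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"), ("loop", "exit"),
]

if __name__ == "__main__":
    # Two decision points (the IF and the loop test) give V(G) = 3.
    print(cyclomatic_complexity(EDGES))
```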

[Basi83c]  Abstract:  A  family  of  syntactic  complexity  metrics  is  defined  that  generates  several  metrics  commonly 
occurring  in  the  literature.  The  paper  uses  the  family  to  answer  some  questions  about  the  relationship  of  these 
metrics  to  error-proneness  and  to  each  other.  Two  derived  metrics  are  applied:  “slope”  which  measures  the  rela¬ 
tive  skills  of  programmers  at  handling  a  given  level  of  complexity  and  “r  square”  which  is  indirectly  related  to  the 
consistency  of  performance  of  the  programmer  or  team.  The  study  suggests  that  individual  differences  have  a 
large  effect  on  the  significance  of  results  where  many  individuals  are  used.  When  an  individual  is  isolated,  better 
results  are  obtained.  The  metrics  can  also  be  used  to  differentiate  between  projects  on  which  a  methodology  was 
used  and  those  on  which  it  was  not. 

[Basi84a]  Abstract:  A  large,  commercially  available  Fortran  program  was  modified  to  produce  structural  cover¬ 
age  metrics.  The  modified  program  was  executed  on  a  set  of  functionally  generated  acceptance  tests  and  a  large 
sample  of  operational  usage  cases.  The  resulting  structural  coverage  metrics  are  combined  with  fault  and  error 
data  to  evaluate  structural  coverage  in  the  SEL  environment. 

We can show that in this environment the functionally generated test cases seem to be a good approximation
of the operational use. The relative proportions of the exercised statement subclasses (executable, assignment,
CALL, DO, IF, READ, WRITE) change as the structural coverage of the program increases. We propose
a method for evaluating if two sets of input data exercise a program in a similar manner.

We  also  provide  evidence  that  implies  that  in  this  environment,  faults  revealed  in  a  procedure  are  indepen¬ 
dent  of  the  number  of  times  the  procedure  is  executed  and  that  it  may  be  reasonable  to  use  procedure  coverage  in 
software  models  that  use  statement  coverage.  Finally,  the  evidence  suggests  that  it  may  be  possible  to  use  struc¬ 
tural  coverage  to  aid  the  management  of  the  acceptance  test  process. 

[Basi84d] Abstract: A considerable amount of money and resources has been spent on the development of the
new  programming  language  Ada.  The  University  of  Maryland  and  General  Electric  have  studied  the  develop¬ 
ment  of  a  software  project  written  in  Ada.  This  paper  presents  the  analysis  of  the  effort,  change,  and  error  data. 
The  total  effort  spent  on  training  and  methodology  was  20%  of  the  total  effort  spent  on  the  project;  this  was  more 
than the effort spent on any other phase. The greatest error rates appeared to be associated with the most Ada-specific
features: tasking, generics, and compilation units. Experience with high-level languages seemed to be
associated  with  a  better  ability  to  grasp  Ada  concepts.  Finally,  the  results  strongly  indicate  the  need  for  support 
tools  for  an  Ada  programming  environment. 

[Basi85a]  Abstract:  Since  both  cost/quality  goals  and  production  environments  differ,  this  study  presents  an 
approach for customizing a characteristic set of software metrics to an environment. The approach is applied in
the  Software  Engineering  Laboratory  (SEL),  a  NASA  Goddard  production  environment,  to  49  candidate  pro¬ 
cess  and  product  metrics  of  652  modules  from  six  (51,000-112,000  line)  projects.  For  this  particular  environment, 
the  method  yielded  the  characteristic  metric  set  (source  lines,  fault  correction  effort  per  executable  statement, 
design  effort,  code  effort,  number  of  I/O  parameters,  number  of  versions).  The  uses  examined  for  a  characteris¬ 
tic  metric  set  include  forecasting  the  effort  for  development,  modification,  and  fault  correction  of  modules  based 
on  historical  data. 

[Basi85b] Abstract: This study compares the strategies of code reading, functional testing, and structural
testing in three aspects of software testing: fault detection effectiveness, fault detection cost, and classes of faults
detected. Thirty-two professional programmers and 42 advanced students applied the three techniques to four
unit-sized programs in a fractional factorial experimental design. The major results of this study are the following.

1.  With  the  professional  programmers,  code  reading  detected  more  software  faults  and  had  a  higher  fault  detec¬ 
tion  rate  than  did  functional  or  structural  testing,  while  functional  testing  detected  more  faults  than  did  struc¬ 
tural  testing,  but  functional  and  structural  testing  were  not  different  in  fault  detection  rate. 

2.  In  one  advanced  student  subject  group,  code  reading  and  functional  testing  were  not  different  in  faults  found, 
but  were  both  superior  to  structural  testing,  while  in  the  other  advanced  student  subject  group  there  was  no 
difference  among  the  techniques. 

3.  With  the  advanced  student  subjects,  the  three  techniques  were  not  different  in  fault  detection  rate. 

4.  Number  of  faults  observed,  fault  detection  rate,  and  total  effort  in  detection  depended  on  the  type  of  software 
tested. 

5.  Code  reading  detected  more  interface  faults  than  did  the  other  methods. 

6.  Functional  testing  detected  more  control  faults  than  did  the  other  methods. 

7.  When  asked  to  estimate  the  percentage  of  faults  detected,  code  readers  gave  the  most  accurate  estimates  while 
functional  testers  gave  the  least  accurate  estimates. 

[Basi85c]  Abstract:  This  paper  presented  a  paradigm  for  evaluating  software  development  methods  and  tools. 
The  basic  idea  is  to  generate  a  set  of  goals  which  are  refined  into  quantifiable  questions  which  specify  metrics  to 
be  collected  on  the  software  development  and  maintenance  process  and  product.  These  metrics  can  be  used  to 
characterize,  evaluate,  predict  and  motivate.  They  can  be  used  in  an  active  as  well  as  passive  way  by  learning 
from  analyzing  the  data  and  improving  the  methods  and  tools  based  upon  what  is  learned  from  that  analysis. 
Several  examples  were  given  representing  each  of  the  different  approaches  to  evaluation. 
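
The goal, question, metric refinement described in this abstract can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class names, the sample goal, and the sample metrics are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)   # names of metrics that answer it


@dataclass
class Goal:
    purpose: str
    questions: list = field(default_factory=list)

    def metrics_to_collect(self):
        """Return each metric once, in the order the questions introduce them."""
        seen = []
        for question in self.questions:
            for metric in question.metrics:
                if metric not in seen:
                    seen.append(metric)
        return seen


# Hypothetical goal refined into quantifiable questions and metrics.
goal = Goal(
    purpose="Evaluate the effect of code reading on fault detection",
    questions=[
        Question("How many faults are found per module?",
                 ["faults_found", "modules_reviewed"]),
        Question("What does detection cost?",
                 ["review_hours", "faults_found"]),
    ],
)

if __name__ == "__main__":
    print(goal.metrics_to_collect())
```

Deriving the collection list from the questions, rather than listing metrics first, mirrors the top-down direction the abstract describes.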

[Basi85e]  Abstract:  Estimating  the  amount  of  effort  required  for  a  software  development  project  is  one  of  the 
major  aspects  of  resource  estimation  for  that  project.  In  this  study,  the  relationship  between  effort  and  other 
variables  for  23  Software  Engineering  Laboratory  projects  that  were  developed  for  NASA/Goddard  Space 
Flight Center was examined. These variables fell into two categories: those which can be determined in the early
stages  of  project  development  and  may  therefore  be  useful  in  a  baseline  equation  for  predicting  effort  in  future 
projects,  and  those  which  can  be  used  mainly  to  characterize  or  evaluate  effort  requirements  and  thus  enhance 
[the authors'] understanding of the software development process in this environment. Some results of the analyses
are presented in this paper.

[Basi85h]  Abbreviated  Introduction:  This  article  examines  the  use  of  Ada  in  a  software  project  developed  by 
the  General  Electric  Company.  The  project  was  monitored  by  the  University  of  Maryland  and  GE  to  identify 
areas of success and difficulty in learning and using Ada as both a design and a coding language. Since production-quality
Ada translators were not readily available, the study focused on training and early software development.
We focus on the use and effect of Ada on this project, which was conducted primarily in 1982. Our study
also  presents  the  major  factors  to  consider  before  using  Ada  in  software  development,  particularly  when  training 
in Ada is necessary. Although many of [the authors'] conclusions may seem obvious now, they were unexpected
when  this  project  began. 

[The authors'] study attempts to meet several goals. The first focuses on characterization of the effort, the
changes,  and  the  errors  of  the  project.  The  second  considers  how  Ada  was  used  on  the  project.  The  third  con¬ 
cerns  evaluation  of  the  data  collection  and  validation  process,  while  the  fourth  concentrates  on  the  development 
of  measures  for  the  Ada  Programming  Support  Environment. 

[Basi86a]  Abstract:  Experimentation  in  software  engineering  supports  the  advancement  of  the  field  through  an 
iterative  learning  process.  In  this  paper  we  present  a  framework  for  analyzing  most  of  the  experimental  work 
performed in software engineering over the past several years. We describe a variety of experiments in the framework
and discuss their contribution to the software engineering discipline. Some useful recommendations for the
application of the experimental process in software engineering are included.

[Basi87a]  Abstract:  More  and  more  project  environments  will  make  the  transition  from  traditional  implementa¬ 
tion  languages  to  Ada.  In  this  context,  many  open  questions  need  to  be  answered,  e.g.,  whether  or  not  Ada 
language  features  and  concepts  are  used  appropriately,  and  how  Ada  projects  should  be  managed  and  supported 
by  methods  and  tools.  It  is  therefore  necessary  to  measure  and  evaluate  the  quality  and  productivity  of  process 
and  product  aspects  of  Ada  projects.  This  can  be  done  by  either  conducting  case  studies  of  ongoing  Ada  projects 
or  experiments  in  controlled  environments.  In  both  cases  concrete  measurement  and  evaluation  goals  need  to  be 
established  in  a  systematic  way,  measures  need  to  be  derived  that  can  help  in  achieving  these  goals,  and  the 
necessary  data  need  to  be  collected,  validated  and  interpreted.  We  have  established  a  methodology  that  allows  us 
to  perform  these  activities  in  a  systematic  way.  However,  the  methodology  must  be  supported  by  automated  tools 
in order to allow on-line feedback of evaluation results into ongoing projects. In the long run, these tools for
on-line  feedback  should  become  part  of  each  APSE  supporting  the  decision  making  process  of  management, 
development,  quality  assurance  personnel,  and  others.  Such  information  would  be  based  on  data  from  the 
current  project  as  well  as  previous  projects  in  the  same  and  other  environments.  In  this  paper  we  present  and  dis¬ 
cuss  the  TAME  (Tailoring  an  Ada  Measurement  Environment)  project  which  aims  at  the  development  of  a  pro¬ 
totype  measurement  and  evaluation  environment  that  supports  all  the  previously  mentioned  activities  including 
the  process  of  setting  up  measurement  and  evaluation  goals  and  deriving  supportive  measures.  We  discuss  the 
TAME  requirements  and  architectural  design,  the  status  of  the  first  prototype,  and  the  expected  impact  of  this 
project  on  Ada  projects  and  APSEs.  The  prototype  currently  under  development  does  not  interface  with  an 
APSE;  however,  it  is  designed  for  being  integrated  into  an  APSE  in  the  future. 

[Basi87b]  Abstract:  This  paper  presents  a  methodology  for  improving  the  software  process  by  tailoring  it  to  the 
specific  project  goals  and  environment.  This  improvement  process  is  aimed  at  the  global  software  process  model 
as  well  as  methods  and  tools  supporting  that  model.  The  basic  idea  is  to  use  defect  profiles  to  help  characterize 
the  environment  and  evaluate  the  project  goals  and  the  effectiveness  of  methods  and  tools  in  a  quantitative  way. 
The  improvement  process  is  implemented  iteratively  by  setting  project  improvement  goals,  characterizing  those 
goals  and  the  environment,  in  part,  via  defect  profiles  in  a  quantitative  way,  choosing  methods  and  tools  fitting 
those  characteristics,  evaluating  the  actual  behavior  of  the  chosen  set  of  methods  and  tools,  and  refining  the  pro¬ 
ject  goals  based  on  the  evaluation  results.  All  these  activities  require  analysis  of  large  amounts  of  data  and,  there¬ 
fore,  support  by  an  automated  tool.  Such  a  tool  -  TAME  (Tailoring  A  Measurement  Environment)  -  is  currently 
being  developed. 

[Basi88]  Abstract:  Experience  from  a  dozen  years  of  analyzing  software  engineering  processes  and  products  is 
summarized  as  a  set  of  software  engineering  and  measurement  principles  that  argue  for  software  engineering  pro¬ 
cess  models  that  integrate  sound  planning  and  analysis  into  the  construction  process. 

In  the  TAME  (Tailoring  A  Measurement  Environment)  project  at  the  University  of  Maryland  we  have 
developed such an improvement-oriented software engineering process model that uses the goal/question/metric
paradigm to integrate the constructive and analytic aspects of software development. The model
provides  a  mechanism  for  formalizing  the  characterization  and  planning  tasks,  controlling  and  improving  pro¬ 
jects  based  on  quantitative  analysis,  learning  in  a  deeper  and  more  systematic  way  about  the  software  process 
and  product,  and  feeding  the  appropriate  experience  back  into  the  current  and  future  projects. 

The  TAME  system  is  an  instantiation  of  the  TAME  software  engineering  process  model  as  an  ISEE 
(Integrated  Software  Engineering  Environment).  The  first  in  a  series  of  TAME  system  prototypes  has  been 
developed.  An  assessment  of  experience  with  this  first  limited  prototype  is  presented  including  a  reassessment  of 
its  initial  architecture.  The  long-term  goal  of  this  building  effort  is  to  develop  a  better  understanding  of  appropri¬ 
ate  ISEE  architectures  that  optimally  support  the  improvement-oriented  TAME  software  engineering  process 
model. 

[Bate83a]  Abbreviated  Introduction:  In  this  paper  we  consider  the  Behavioral  Abstraction  (BA)  approach  to 
high-level debugging of distributed systems. In Section 2, we discuss behavioral abstraction and the Event
Definition Language that is the basis for a debugging tool implementing this approach. Section 3 addresses one of
the  fundamental  issues  arising  in  actually  providing  debugging  aid  through  the  BA  approach,  that  of  recognizing 
the occurrence of abstracted behaviors. We conclude the paper with an assessment of [the authors'] present status
and  outstanding  problems. 

[Baue79b]  Overview:  Formal  program  construction  by  transformations  is  a  method  of  software  development  in 
which  a  program  is  derived  from  a  formal  problem  specification  by  manageable,  controlled  transformation  steps 
which  guarantee  that  the  final  product  meets  the  initial  specification.  This  methodology  has  been  investigated  in 
the  Munich  project  CIP  (computer-aided,  intuition-guided  programming).  The  research  includes  the  design  of  a 
wide-spectrum  language  specifically  tailored  to  the  needs  of  transformational  programming,  the  construction  of  a 
transformation  system  to  support  the  methodology,  and  the  study  of  transformation  rules  and  other  methodologi¬ 
cal  issues.  Particular  emphasis  has  been  laid  on  developing  a  sound  theoretical  basis  for  the  overall  approach. 

[Baue89]  Abstract:  Formal  program  construction  by  transformations  is  a  method  of  software  development  in 
which  a  program  is  derived  from  a  formal  problem  specification  by  manageable,  controlled  transformation  steps 
which guarantee that the final product meets the initial specification. This methodology has been investigated in
the  Munich  project  CIP  (computer-aided,  intuition-guided  programming).  The  research  includes  the  design  of  a 
wide-spectrum  language  specifically  tailored  to  the  needs  of  transformational  programming,  the  construction  of  a 
transformation  system  to  support  the  methodology,  and  the  study  of  transformation  rules  and  other  methodologi¬ 
cal  issues.  Particular  emphasis  has  been  laid  on  developing  a  sound  theoretical  basis  for  the  overall  approach. 

[Bazz82]  Abstract:  A  new  method  for  testing  compilers  is  presented.  The  compiler  is  exercised  by  compilable 
programs,  automatically  generated  by  a  test  generator.  The  generator  is  driven  by  a  tabular  description  of  the 
source language. This description is in a formalism which nicely extends context-free grammars in a context-dependent
direction, but still retains the structure and readability of BNF. The generator produces a set of programs
which cover all grammatical constructions of the source language, unless user-supplied directives instruct
otherwise. The programs generated can also be used to evaluate the performance of different compilers of the
same  source  language. 

A  significant  example  from  Pascal  is  presented,  and  experience  with  the  generator  is  reported. 
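
The idea of generating test programs that cover every grammatical construction can be illustrated with a small sketch. It assumes a plain context-free grammar held in a Python dictionary and does not model the context-dependent extensions or the tabular notation the abstract mentions; the toy grammar and all function names are hypothetical.

```python
from collections import deque

# Hypothetical toy grammar: nonterminal -> list of productions (tuples of symbols).
GRAMMAR = {
    "stmt": [("id", ":=", "expr"), ("if", "expr", "then", "stmt")],
    "expr": [("term",), ("expr", "+", "term")],
    "term": [("id",), ("num",), ("(", "expr", ")")],
}


def shortest_expansions(grammar):
    """Fixpoint computation of a shortest terminal string for each nonterminal."""
    best, changed = {}, True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                if all(s in best or s not in grammar for s in prod):
                    words = [w for s in prod
                             for w in (best[s] if s in grammar else [s])]
                    if nt not in best or len(words) < len(best[nt]):
                        best[nt], changed = words, True
    return best


def first_occurrences(grammar, start):
    """Breadth-first search recording where each nonterminal is first reachable."""
    parent, queue = {start: None}, deque([start])
    while queue:
        nt = queue.popleft()
        for prod in grammar[nt]:
            for i, sym in enumerate(prod):
                if sym in grammar and sym not in parent:
                    parent[sym] = (nt, prod, i)
                    queue.append(sym)
    return parent


def covering_tests(grammar, start):
    """Return sentences that together use every reachable production at least once."""
    best = shortest_expansions(grammar)
    parent = first_occurrences(grammar, start)

    def fill(prod, hole=None, hole_words=()):
        words = []
        for i, sym in enumerate(prod):
            if i == hole:
                words.extend(hole_words)
            elif sym in grammar:
                words.extend(best[sym])
            else:
                words.append(sym)
        return words

    tests = []
    for nt in parent:
        for prod in grammar[nt]:
            words, cur = fill(prod), nt
            while parent[cur] is not None:      # wrap upward until the start symbol
                cur, outer, i = parent[cur]
                words = fill(outer, i, words)
            tests.append(" ".join(words))
    return list(dict.fromkeys(tests))           # drop duplicates, keep order


if __name__ == "__main__":
    for sentence in covering_tests(GRAMMAR, "stmt"):
        print(sentence)
```

Each production is embedded in the shortest surrounding context, so the resulting sentences stay small while still exercising every rule.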

[Beck76] Abstract: Programs which perform partial evaluation, beta-expansion, and certain optimizations on
programs,  are  studied  with  respect  to  implementation  and  application.  Two  implementations  are  described,  one 
“interpretive” partial evaluator, which operates directly on the program to be partially evaluated, and a “compiling”
system, where the program to be partially evaluated is used to generate a specialized program, which in its
turn  is  executed  to  do  the  partial  evaluation.  Three  applications  with  different  requirements  on  these  programs 
are  described.  Proofs  are  given  for  the  equivalence  of  the  use  of  the  interpretive  system  and  the  compiling  system 
in  two  of  the  three  cases.  The  general  use  of  the  partial  evaluator  as  a  tool  for  the  programmer  in  conjunction 
with  certain  programming  techniques  is  discussed. 

[Behr83]  Abstract:  The  function  point  method  of  measuring  application  development  productivity  developed  by 
Albrecht  is  reviewed  and  a  productivity  improvement  measure  introduced.  The  measurement  methodology  is 
then  applied  to  24  development  projects.  Size,  environment,  and  language  effects  on  productivity  are  examined. 
The  concept  of  a  productivity  index  which  removes  size  effects  is  defined  and  an  analysis  of  the  statistical  signifi¬ 
cance  of  results  is  presented. 

[Beiz83]  Table  of  Contents:  Introduction.  The  taxonomy  of  bugs.  Flowcharts  and  path  testing.  Path  testing  and 
transaction  flows.  Graphs,  paths  and  complexity.  Paths,  path  products,  and  regular  expressions.  Data  validation 
and syntax testing. Data-base-driven testing design. Decision tables and Boolean algebra. Boolean algebra the
easy  way.  States,  state  graphs,  and  transition  testing.  Graph  matrices  and  applications. 

[Bela76] Abbreviated Introduction: As a need for a discipline of software engineering has been recognized, the
design, implementation, and maintenance of computer software has come into the forefront. The formulation of
concepts  of  programming  methodology,  exemplified  by  Dijkstra’s  structured  programming,  strikes  at  the  roots  of 
the  problem.  The  realization  is  that  a  program,  much  as  a  mathematical  theorem,  should  and  can  be  provable. 
Recognition  that  a  program  can  be  proved  correct  as  it  is  developed  and  maintained,  and  before  its  results  are 
used, may ultimately change the nature of the programming task and the face of the programming world. Clearly
these  developments  are  of  fundamental  importance.  They  appear  to  point  to  long-term  solutions  to  problems 
that  will  be  encountered  in  creating  the  great  amount  of  program  text  that  the  world  appears  to  require.  But  even 
though  progress  in  mastering  the  science  of  program  creation,  maintenance,  and  expansion  has  also  been  made, 
there  is  still  a  long  way  to  go. 

[Bela81] Abstract: Program modules and data structures are interconnected by calls and references in software
systems.  Partitioning  these  entities  into  clusters  reduces  complexity.  For  very  large  systems  manual  clustering  is 
impractical.  A  method  to  perform  automatic  clustering  is  described  and  a  metric  to  quantify  the  complexity  of 
the  resulting  partition  is  developed. 

[Belk86]  Abstract:  The  development  of  correct  specifications  is  a  critical  task  in  the  software  development  pro¬ 
cess.  This  paper  describes  an  alternative  approach  for  the  development  of  specifications.  The  approach  relies  on 
a  specification  language  for  abstract  data  types  and  a  synthesis  system.  The  system  is  capable  of  translating  an 
abstract  data  type  specification  into  an  executable  program.  This  process  defines  an  alternative  methodology  that 
provides  the  necessary  tools  for  the  early  testing  of  the  specifications  and  for  the  development  of  prototypes  and 
implementation  models. 

[Bene85]  Abstract:  Modern  complex  system  reliability  has  to  take  into  account  more  and  more  programmed 
system  reliability.  This  raises  two  kinds  of  problems: 

—  Software  reliability:  i.e.,  software  quality  assurance,  specifications,  development  methods,  languages,  pro¬ 
gramming,  test  policies,... 

—  Hardware  reliability  at  three  levels:  input,  processors,  output. 

A  method  among  others  enabling  to  modelize  software  and  hardware  behaviour  from  the  reliability  point  of  view 
is  the  stochastic  Markov  process  method. 

First,  the  principles  of  the  method  will  be  given  and  advantages  will  be  pointed  out  in  comparison  with 
other  more  classical  methods  for  reliability  analysis. 

In  the  second  part  of  the  paper,  software  tools  to  solve  this  kind  of  problems  will  be  described  and  in  the 
final  part  of  the  presentation  an  example  of  successful  use  of  these  computer  codes  will  be  given. 

[Beng87]  Abstract:  Operational  analysis,  an  area  of  study  first  defined  in  the  computer  science  field,  has  been 
used  in  the  analysis  of  systems  performance.  System  performance  measures  for  a  specific  set  of  output  data  are 
obtained  using  operational  analysis  formulas  derived  from  assumptions  which  are  verifiable  by  the  observed 
data.  This  paper  gives  relationships  which  may  be  used  to  quantify  the  errors  in  these  assumptions.  Additionally, 
basic  propositions  are  given  which  help  in  understanding  operational  analysis  assumptions.  These  propositions 
are  used  in  developing  correction  terms  which  can  be  used  to  adjust  performance  measures  so  that  their  values 
are  exact  for  a  set  of  data  no  matter  how  much  the  assumptions  used  in  deriving  the  performance  measure  rela¬ 
tions  are  violated. 

[Bens81]  Abstract:  An  experiment  was  performed  in  which  executable  assertions  were  used  in  conjunction  with 
search  techniques  in  order  to  test  a  computer  program  automatically.  The  program  chosen  for  the  experiment 
computes  a  position  on  an  orbit  from  the  description  of  the  orbit  and  the  desired  point. 

Errors were inserted in the program randomly using an error generation method based on published
data defining common error types. Assertions were written for the program and it was tested using two different
techniques.  The  first  divided  up  the  range  of  the  input  variables  and  selected  test  cases  from  within  the 
subranges.  In  this  way  a  “grid”  of  test  values  was  constructed  over  the  program’s  input  space. 

The second used a search algorithm from optimization theory. This entailed using the assertions to define
an error function and then maximizing its value. The program was then tested by varying all of them. The results
indicate  that  this  search  testing  technique  was  as  effective  as  the  grid  testing  technique  in  locating  errors  and  was 
more  efficient.  In  addition,  the  search  testing  technique  located  critical  input  values  which  helped  in  writing 
correct  assertions. 
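
The two techniques compared in this abstract can be contrasted in a short sketch. Everything in it is an assumption made for illustration: the stand-in program, the seeded error, the tolerance, and the simple one-dimensional step-halving search are hypothetical and are not the orbit program or the optimization algorithm used in the experiment.

```python
import math


def program_under_test(x):
    """Hypothetical stand-in: a square root with a seeded error for x > 60."""
    r = math.sqrt(x)
    if x > 60:                      # seeded fault, for illustration only
        r += 0.01 * (x - 60)
    return r


def error(x):
    """Executable assertion as an error function: 0 means the assertion holds."""
    r = program_under_test(x)
    return abs(r * r - x)


def grid_test(lo, hi, n, tol=1e-9):
    """Grid technique: sample n evenly spaced inputs and report the violations."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n) if error(lo + i * step) > tol]


def search_test(lo, hi, min_step=1e-3):
    """Search technique: move toward larger assertion error, halving the step."""
    x, step = (lo + hi) / 2, (hi - lo) / 4
    while step >= min_step:
        candidates = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(candidates, key=error)
        if error(best) > error(x):
            x = best                # a neighbour violates the assertion more
        else:
            step /= 2               # no improvement: refine the step size
    return x, error(x)


if __name__ == "__main__":
    print(grid_test(0.0, 100.0, 5))
    print(search_test(0.0, 100.0))
```

The grid reports every sampled input that violates the assertion, while the search converges on the input where the violation is largest, which is the kind of critical value the abstract says helped in writing correct assertions.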

[Bera83]  Introduction:  This  column  is  the  first  in  a  series  of  articles  dealing  with  the  utilization  of  Ada  in 
software  engineering.  All  examples  will  use  Ada,  even  though  virtually  all  of  the  techniques  discussed  could 
apply  to  any  programming  language.  This  will  be  particularly  advantageous  when  we  wish  to  compare  Ada  to 
other  programming  languages. 

[Berg82]  Table  of  Contents:  Introduction  to  fundamental  techniques.  Formal  models  of  computation.  Verifica¬ 
tion  methods  and  techniques.  Approaches  to  proofs  of  partial  correctness.  Approaches  to  proofs  of  total 
correctness.  Correctness  of  parallel  programs.  Application  of  verification  approaches.  Approaches  to  specifica¬ 
tion.  State  of  the  art  and  summary.  References. 

[Berl80]  Introduction:  There  have  been  numerous  measures  proposed  to  measure  program  complexity.  Some 
are  completely  heuristic,  comparing  certain  measurable  program  features  against  a  set  of  predefined  standards. 
Some are topological, based on the number of regions on the control or data graph of the program or a combination
of the above, and of course, there is Halstead’s Software Physics.

All  of  these  measures  have  their  deficiencies  and,  no  doubt,  so  will  ours.  We  have,  however,  set  ourselves 
the  goal  of  eliminating  some  of  them  and  to  provide  a  measure  which  has  mathematical  and  intuitive  correctness 
and  which  will  have  a  good  correlation  with  observed  facts. 

[Bern84] Introduction: The job of software maintenance (correcting errors and changing program operation as
requirements change) generally devolves upon personnel not involved in the original software development cycle
who  must  learn  how  a  program  works  before  they  can  competently  change  it.  Among  the  variables  involved  in 
this  learning  process  are  the  accuracy,  currency,  and  completeness  of  program  documentation;  programmer  skill 
and  experience;  environmental  factors  such  as  urgency,  the  programming  language,  and  especially,  the  attributes 
of  the  program  itself. 

Program maintainability and program understandability are parallel concepts: the more difficult a program
is to understand, the more difficult it is to maintain. And the more difficult it is to maintain, the higher its
maintainability risk. Since it is to the source program that maintenance staff must ultimately come, it would be useful
to be able to quantify the relative magnitude of the task through an analysis solely of the attributes of the
program.

Attempts have been made to quantify program difficulty by manipulating simple counts of selected
program attributes, e.g., lines of code, bifurcation points (cyclomatic number), and operations and operands
(Halstead length). Although these manipulations may be informative, none has been persuasively shown to be a
reliable measure of program difficulty.

This paper presents an approach based on the tenet that program difficulty represents the sum of the
difficulties of its constituent elements, and that these elements can be quantified by the use of carefully selected
weights and factors.

[Berr87]  Abstract:  In  carrying  out  SDC’s  Formal  Development  Method,  one  writes  a  specification  of  a  system 
under  design  in  the  Ina  Jo  specification  language  and  proves  that  the  specification  meets  the  requirements  of  the 
system.  This  paper  develops  an  abstract  machine  model  of  what  is  specified  by  a  level  specification  in  an  Ina  Jo 
specification.  It  describes  the  state  as  defined  by  the  front  matter,  computations  as  defined  by  initial  states  and 
transforms,  and  invariants,  criteria,  and  constraints  as  properties  of  computations.  The  paper  then  describes  a 
number  of  formal  design  methods  and  the  kinds  of  abstractions  that  they  require.  For  each  of  these  kinds  of 
abstractions,  there  is  a  characteristic  relationship  between  refinements  that  should  be  proved  as  one  is  carrying 
out the method.

[Besh85] Abstract: Hardware manufacturers usually provide a “data sheet” listing parameters that describe the
hardware’s  functionality  and  limitations  on  its  use.  This  paper  explores  the  feasibility  of  developing  a  similar 
document  to  describe  software  quality.  It  defines  parameters  that  are  useful  for  reporting  quality  in  three  major 
areas:  reliability,  maintainability,  and  robustness. 

[Bess87] Overview: The key purpose of the Test Environment Generator (GET) is to provide the automatic
generation of drivers and possible virtual bodies associated with a software component. Drivers and virtual bodies
will constitute, together with the component itself, a complete compilable, linkable and executable program. This
program  (called  Test  Environment)  is  interactive;  when  executed,  it  allows  the  user  to  perform  most  of  the 
operations  that  are  usually  executed  through  a  test  program.  For  example,  the  user  can  create  variables,  assign 
objects,  call  subprograms,  indicating  the  values  of  in  or  in-out  parameters,  and  examine,  after  the  call,  the  result 
values. 

In addition to important time saving, the automatic generation of a testing environment facilitates the
performance of the tests by a separate testing team. It also adds to the efficiency and comfort of such a team by
providing a standard and powerful user interface to be used for all the components to be tested.

[Bish86]  Abstract:  The  Project  On  Diverse  Software  (PODS)  was  a  collaborative  software  reliability  research 
project  whose  main  objectives  were: 

•  To  evaluate  the  merits  of  using  diverse  (or  n-version)  software. 

•  To  evaluate  the  computer-based  specification  language  “X.” 

•  To  compare  the  effects  of  representative  high-level  and  low-level  languages  on  productivity  and  reliability. 

In  addition,  there  was  a  secondary  objective  to  monitor  the  software  development  process  to  produce 
three  diverse  programs  to  the  same  requirements.  The  requirement  was  for  a  reactor  over-power  protection 
(trip)  system.  Diversity  was  ensured  by  having  three  independent  teams  to  produce  the  software,  using  different 
specification methods (formal and informal) and different implementation languages (assembly language and
Fortran). This also allowed the comparison of specification methods and programming languages to be made.
After careful independent development and testing, the three programs were tested against each other in a
special test harness to locate residual faults. All phases of the project were carefully documented for subsequent
analysis. 

The  major  conclusions  for  this  particular  project  were  that: 

•  Diverse  software  with  majority  voting  failed  less  frequently  than  any  individual  program,  but  some  common 
faults  did  exist  at  the  end  of  normal  software  development. 

•  Testing  diverse  programs  “back-to-back”  proved  to  be  a  powerful  method  of  detecting  residual  faults. 

• The residual faults were all related to the specification of requirements, and hence, the requirement
specification was the only known cause of common mode failure.
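
The two mechanisms named in these conclusions, majority voting and back-to-back testing, can be sketched as follows. The three trip functions, their parameters, and the seeded fault in the third version are hypothetical stand-ins invented for this illustration; they are not the PODS programs.

```python
from collections import Counter


def version_a(power, limit):
    # Hypothetical version 1: trips when power exceeds the limit.
    return power > limit


def version_b(power, limit):
    # Hypothetical version 2: same requirement, formulated differently.
    return (power - limit) > 0


def version_c(power, limit):
    # Hypothetical version 3 with a seeded fault: it also trips at equality.
    return power >= limit


VERSIONS = [version_a, version_b, version_c]


def majority_vote(outputs):
    """Return the most common output; three boolean outputs always have a majority."""
    return Counter(outputs).most_common(1)[0][0]


def back_to_back(test_inputs):
    """Run every version on each input and report the inputs where outputs disagree."""
    discrepancies = []
    for power, limit in test_inputs:
        outputs = [version(power, limit) for version in VERSIONS]
        if len(set(outputs)) > 1:
            discrepancies.append(((power, limit), outputs))
    return discrepancies


# Only the boundary case exposes the seeded fault; the vote still masks it.
found = back_to_back([(90, 100), (100, 100), (110, 100)])
voted = majority_vote([version(100, 100) for version in VERSIONS])
```

A fault shared by all three versions would produce identical outputs and go unnoticed by this comparison, which is consistent with the abstract's remark that some common faults remained.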

[Bjor87] Abstract: We propose a total framework for the software development stages of specification
(definition), design and coding. This framework is based on three cornerstones: (a) the concept of software
development graphs which specify all the stages and steps of development; (b) the use of formal methods, in [the
authors’] case VDM, the Vienna Software Development Method, in all stages and steps of development; and (c)
the clearly separate roles of theoretical computer scientists, programmers, software engineers, and development
managers in all aspects of software development. Thus not only programming is formalised (i.e., the entire
programming itself is also considered a formal object about which to reason).

[Blac81]  Abstract:  The  addition  of  redundancy  to  data  structures  can  be  used  to  improve  the  ability  of  a  software 
system  to  detect  and  correct  errors,  and  to  continue  to  operate  according  to  its  specifications.  A  case  study  is 
presented  which  indicates  how  such  redundancy  can  be  deployed  and  exploited  at  reasonable  cost  to  improve 
software  fault  tolerance.  Experimental  results  are  reported  for  the  small  data  base  system  considered. 

[Blai71]  Abstract:  The  Purdue  Extendable  Debugging  system  (PEBUG)  is  a  general  purpose  debugging  system 
which operates under the Purdue version of the Mace operating system on the CDC 6500. PEBUG is designed to
provide flexible debugging in either an interactive or non-interactive mode. The basic construction of PEBUG
primitives,  debugging  commands,  and  the  interface  used  for  extension,  are  described. 

[Blai85a]  Abstract:  This  paper  presents  the  results  of  a  study  of  the  software  complexity  characteristics  of  a 
large  real-time  signal  processing  system  for  which  there  is  a  6-yr  maintenance  history.  The  objective  of  the  study 
was  to  compare  values  generated  by  software  metrics  to  the  maintenance  history  in  order  to  determine  which 
software  complexity  metrics  would  be  most  useful  for  estimating  maintenance  effort.  The  metrics  that  were 
analyzed  were  program  size  measures,  software  science  measures,  and  control  flow  measures.  During  the  course 
of  the  study  two  new  software  metrics  were  defined.  The  new  metrics,  maximum  knot  depth  and  knots  per  jump 
ratio,  are  both  extensions  of  the  knot  count  metric.  When  comparing  the  metrics  to  the  maintenance  data  the 
control  flow  measures  showed  the  strongest  positive  correlation. 

[Bloo86] Abstract: The increasing use of computers to protect or control potentially hazardous processes leads
to  a  need  for  effective  methods  to  assess  the  software  they  execute.  This  correspondence  presents  a  case  study  in 
which  the  Vienna  development  method  (VDM),  a  formal  specification  and  development  methodology,  was  used 
during the analysis phase of the assessment of a prototype nuclear reactor protection system. The VDM
specification was also translated into the logic language Prolog to animate the specification and to provide a diverse
implementation  for  use  in  back-to-back  testing.  The  authors  claim  that  this  technique  provides  a  visible  and 
effective  method  of  analysis  which  is  superior  to  the  informal  alternatives. 

[Blum75]  Abstract:  Intelligence  tests  occasionally  require  the  extrapolation  of  an  effective  sequence  (e.g.  1661, 
2552,  3663,  ...)  that  is  produced  by  some  easily  discernible  algorithm.  In  this  paper,  we  investigate  the  theoretical 
capabilities  and  limitations  of  a  computer  to  infer  such  sequences.  We  design  Turing  machines  that  in  principle 
are  extremely  powerful  for  this  purpose  and  place  upper  bounds  on  the  capabilities  of  machines  that  would  do 
better. 

[Boch87b]  Abstract:  The  use  of  formal  specifications  in  software  development  allows  the  use  of  certain 
automated  tools  during  the  specification  and  software  development  process.  Formal  description  techniques  have 
been  developed  for  the  specification  of  communication  protocols  and  services.  This  paper  describes  the  partial 
automation of the protocol implementation process based on a formal specification of the protocol to be
implemented. An implementation strategy and a related software structure for the implementation of state transition
oriented  specifications  is  presented.  Its  application  is  demonstrated  with  a  much  simplified  Transport  protocol. 
The  automated  translation  of  specifications  into  implementation  code  in  a  high-level  language  is  also  discussed. 
A  semiautomated  implementation  strategy  is  explained  which  highlights  several  refinement  steps,  part  of  which 
are  automated,  which  lead  from  a  formal  protocol  specification  to  an  implementation.  Experience  with  several 
full  implementations  of  the  OSI  Transport  protocol  is  described. 

[Boeh75a]  Abstract:  This  paper  summarizes  some  recent  experience  in  analyzing  and  eliminating  sources  of 
error  in  the  design  phase  of  large  software  projects.  It  begins  by  pointing  out  some  of  the  significant  differences 
in  software  error  incidence  between  large  and  small  software  projects.  The  most  striking  contrast,  illustrated  by 
project  data,  is  the  large  preponderance  of  design  errors  over  coding  errors  on  large  scale  projects,  not  only  with 
respect  to  numbers  of  errors,  but  also  with  respect  to  the  relative  time  and  effort  required  to  detect  them  and 
correct  them. 

The  paper  next  presents  a  taxonomy  of  software  error  causes,  and  some  analyses  of  the  design  error  data, 
performed  to  obtain  a  better  understanding  of  the  nature  of  large-scale  software  design  errors  and  to  evaluate 
alternative  methods  of  preventing,  detecting,  and  eliminating  them. 

Based on this analysis of observational data, a hypothesis was derived regarding the potential
cost-effectiveness of an automated aid to detecting inconsistencies between assertions about the nature of inputs and
outputs of the various elements (functions, modules, data bases, data sources, etc.) of the software design. This
hypothesis was tested by developing a prototype version of such an aid, the Design Assertion Consistency
Checker (DACC), using TRW’s Generalized Information Management (GIM) System, and using it on a
large-scale software project with 186 elements and 967 assertions about their inputs and outputs.

Of  the  121,000  possible  mismatches  between  input  and  output  assertions,  DACC  found  818,  at  a  cost  in 
computer time of $30. Most of the mismatches resulted from shortfalls in the initial version of DACC or the
initial data preparation, such as lack of a synonym capability and a lack of explicit statements about external inputs
and  outputs.  However,  a  number  of  serious  mismatches  were  exposed  at  a  time  when  they  were  easy  to  correct, 
and  a  most  useful  worklist  generated  of  items  needing  resolution  before  allowing  the  design  effort  to  proceed  to 
further  detail. 

In  general,  the  data  confirmed  the  hypothesis  about  the  general  utility  of  a  DACC  capability  for  large 
software  projects.  However,  a  number  of  additional  features  should  be  considered  to  compensate  for  current 
deficiencies  (in  areas  such  as  manuscript  preparation)  and  to  fully  take  advantage  of  having  the  software  design  in 
machine-readable  form. 
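
The kind of check this abstract describes, comparing input assertions against output assertions across design elements, might look roughly like the sketch below. The data layout, the element names, the use of units as the only asserted attribute, and the two mismatch rules are assumptions made for illustration; the abstract does not give DACC's actual rules or data format.

```python
def find_mismatches(elements, external_inputs=()):
    """Return (consumer, item, reason) for input assertions with no matching output assertion."""
    # Index every output assertion by item name.
    produced = {}
    for name, spec in elements.items():
        for item, units in spec["outputs"].items():
            produced.setdefault(item, []).append((name, units))

    mismatches = []
    for name, spec in elements.items():
        for item, units in spec["inputs"].items():
            if item in external_inputs:
                continue  # declared as coming from outside the design
            producers = produced.get(item, [])
            if not producers:
                mismatches.append((name, item, "no producer"))
            elif all(u != units for _, u in producers):
                mismatches.append((name, item, "units differ"))
    return mismatches


# Hypothetical two-element design used only to exercise the check.
design = {
    "tracker": {"inputs": {"radar_range": "km"}, "outputs": {"target_position": "km"}},
    "display": {"inputs": {"target_position": "miles", "operator_mode": "enum"}, "outputs": {}},
}

with_externals = find_mismatches(design, external_inputs={"radar_range"})
without_externals = find_mismatches(design)
```

Leaving out the external-input declaration adds a spurious "no producer" report for `radar_range`, a small-scale version of the shortfall the abstract attributes to missing statements about external inputs and outputs.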

[Boeh75b] Introduction: The high cost of software should be considered more of an opportunity than a
problem. Nobody can say for sure to what extent software “costs too much.” However, the high cost implies that
additional improvements will lead to significant savings, which should justify additional investments to
streamline some of the institutions, techniques, and procedures which often hinder software productivity.

This  chapter  will  address  three  main  questions:  How  high  is  the  cost  of  software?  Where  do  costs  go? 
What  factors  influence  costs?  (or,  what  can  we  do  about  them?) 

For  reference,  “software  production”  here  includes  all  the  effort  involved  in  producing  and  maintaining 
the necessary executive, support, and applications programs and their documentation, starting from a
reasonably well-defined functional specification. Most of the software data comes from the Air Force, primarily from
the  CCIP-85  study  and  the  recent  Air  Force-Industry  Software  Cost  Workshop,  but  they  are  probably  fairly 
representative  of  other  software  activities  elsewhere. 

[Boeh78] Abstract: The study reported in this paper establishes a conceptual framework and some key initial
results  in  the  analysis  of  the  characteristics  of  software  quality.  Its  main  results  and  conclusions  are: 

•  Explicit  attention  to  characteristics  of  software  quality  can  lead  to  significant  savings  in  software  life-cycle 
costs. 

• The current software state-of-the-art imposes specific limitations on our ability to automatically and
quantitatively evaluate the quality of software.

•  A  definitive  hierarchy  of  well-defined,  well-differentiated  characteristics  of  software  quality  is  developed.  Its 
higher-level  structure  reflects  the  actual  uses  to  which  software  quality  evaluation  would  be  put;  its  lower-level 
characteristics  are  closely  correlated  with  actual  software  metric  evaluations  which  can  be  performed. 

•  A  large  number  of  software  quality-evaluation  metrics  have  been  defined,  classified,  and  evaluated  with 
respect  to  their  potential  benefits,  quantifiability,  and  ease  of  automation. 

• Particular software life-cycle activities have been identified which have significant leverage on software
quality.

Most  importantly,  we  believe  that  the  study  reported  in  this  paper  provides  for  the  first  time  a  clear,  well- 
defined  framework  for  assessing  the  often  slippery  issues  associated  with  software  quality,  via  the  consistent  and 
mutually supportive sets of definitions, distinctions, guidelines, and experiences cited. This framework is
certainly not complete, but it has been brought to a point sufficient to serve as a viable basis for future refinements
and  extensions. 

The  bulk  of  the  work  reported  in  this  book  was  performed  in  a  study  by  TRW  for  the  National  Bureau  of 
Standards  in  1973.  The  book  presents  this  original  material  and  subsequent  updates  in  the  following  order: 

•  A  preface  which  introduces,  summarizes,  and  updates  the  1973  study; 

•  The  text  of  the  1973  study; 

•  A  revised  and  updated  version  of  the  annotated  bibliography  prepared  for  the  1973  study. 

[Boeh81]  Abbreviated  Preface:  A  course  in  engineering  economics  has  become  a  fairly  standard  component  of 
the  hardware  engineer’s  education.  So  far,  the  opportunities  for  software  engineers  to  take  a  similar  course 
tailored to software engineering economics have been rare. As a result, [the author thinks] most software
engineers miss out on a chance to acquire and use a number of significant economic concepts, techniques, and
facts  which  can  play  a  vital  part  in  their  future  careers— and  a  vital  part  in  making  our  software  easier  to  live  with 
and  more  worthwhile. 

Not  surprisingly,  then,  the  major  objective  of  this  book  is  to  provide  a  basis  for  a  software  engineering 
economics  course,  intended  to  be  taken  at  the  college  senior/first-year  graduate  level. 

[Boeh84a]  Abstract:  In  this  experiment,  seven  software  teams  developed  versions  of  the  same  small-size 
(2000-4000 source instruction) application software product. Four teams used the Specifying approach. Three
teams used the Prototyping approach.

The  main  results  of  the  experiment  were  the  following. 

1.  Prototyping  yielded  products  with  roughly  equivalent  performance,  but  with  about  40  percent  less  code  and  45 
percent  less  effort. 

2.  The  prototyped  products  rated  somewhat  lower  on  functionality  and  robustness,  but  higher  on  ease  of  use  and 
ease  of  learning. 

3. Specifying produced more coherent designs and software that was easier to integrate.

The  paper  presents  the  experimental  data  supporting  these  and  a  number  of  additional  conclusions. 

[Boeh84b]  Introduction:  A  major  effort  at  improving  productivity  at  TRW  led  to  the  creation  of  the  software 
productivity project, or SPP, in 1981. The major thrust of this project is the establishment of a software
development environment to support project activities; this environment is called the software productivity system, or
SPS.  It  involves  a  set  of  strategies,  including  the  work  environment;  the  evaluation  and  procurement  of 
hardware  equipment;  the  provision  for  immediate  access  to  computing  resources  through  local  area  networks; 
the  building  of  an  integrated  set  of  tools  to  support  the  software  development  life  cycle  and  all  project  personnel; 
and a user support function to transfer new technology. All of these strategies are being accomplished
incrementally. The current architecture is Vax-based and uses the Unix operating system, a wideband local network, and a
set  of  software  tools. 

This  article  describes  the  steps  that  led  to  the  creation  of  the  SPP,  summarizes  the  requirements  analyses 
on  which  the  SPS  is  based,  describes  the  components  which  make  up  the  SPS,  and  presents  our  conclusions. 

[Boeh86]  Overview:  The  spiral  model  of  software  development  and  enhancement  presented  here  provides  a  new 
framework  for  guiding  the  software  process.  Its  major  distinguishing  feature  is  that  it  creates  a  risk-driven 
approach  to  the  software  process,  rather  than  a  strictly  specification-driven  or  prototype-driven  process.  It 
incorporates  many  of  the  strengths  of  other  models,  while  resolving  many  of  their  difficulties. 

This  section  presents  a  short  historical  background  of  software  process  models  and  the  issues  they 
address. Section 2 summarizes the process steps involved in the spiral model. Section 3 illustrates the
application of the spiral model to a software project, using the TRW Software Productivity Project as an example.
Section 4 summarizes the primary advantages, challenges, and implications involved in using the spiral model, and
Section  5  presents  the  resulting  conclusions. 

[Boeh87]  Summary:  A  candidate  top  10  list  of  software  metric  relationships,  in  terms  of  their  value  in  industrial 
situations.  Here  they  are,  in  rough  priority  order: 

1. Finding and fixing a software problem after delivery is 100 times more expensive than finding and fixing it
during the requirements and early design phases.

2.  You  can  compress  a  software  development  schedule  up  to  25  percent  of  nominal,  but  no  more. 

3.  For  every  dollar  you  spend  on  software  development  you  will  spend  two  dollars  on  software  maintenance. 

4.  Software  development  and  maintenance  costs  are  primarily  a  function  of  the  number  of  source  instructions  in 
the  product. 

5.  Variations  between  people  account  for  the  biggest  differences  in  software  productivity. 

6.  The  overall  ratio  of  computer  software  to  hardware  costs  has  gone  from  15:85  in  1955  to  85:15  in  1985,  and  it  is 
still growing.

7. Only about 15 percent of software product-development effort is devoted to programming.

8.  Software  systems  and  software  products  each  typically  cost  three  times  as  much  per  instruction  to  fully 
develop  as  does  an  individual  software  program.  Software-system  products  cost  nine  times  as  much. 

9.  Walkthroughs  catch  60  percent  of  the  errors. 

10. Many software phenomena follow a Pareto distribution: 80 percent of the contribution comes from 20
percent of the contributors. Some examples: 20 percent of the modules contribute 80 percent of the cost, and 20
percent  of  the  modules  contribute  80  percent  of  the  errors  (not  necessarily  the  same  ones). 
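
Item 10 can be checked against a project's own data with a few lines of code. The sketch below computes the share of a total contributed by the top 20 percent of items; the function name and the per-module error counts are invented for illustration and are not data from the article.

```python
def top_share(counts, fraction=0.20):
    """Share of the total contributed by the top `fraction` of items (at least one item)."""
    ordered = sorted(counts, reverse=True)
    k = max(1, round(len(ordered) * fraction))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Hypothetical error counts for ten modules (100 errors in total).
errors_per_module = [40, 40, 5, 5, 3, 3, 2, 1, 1, 0]
share = top_share(errors_per_module)
```

With these made-up counts the two most error-prone modules (20 percent of ten) account for 80 of the 100 errors, matching the 80/20 pattern the list describes; real data would need to be measured rather than assumed.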

[Boot80]  Abstract:  The  concept  of  abstract  data  types  is  extended  to  associate  performance  information  with 
each abstract data type representation. The resulting performance abstract data type contains a functional part
which describes the functional properties of the data type and a performance part which describes the
performance characteristics of the data type. The performance part depends upon 1) the algorithms and data
representation selected to represent the data type, 2) the particular machine on which the software realization of the data
type is realized, and 3) the statistical properties of the actual data represented by the data objects involved in the
data type. Methods for determining the necessary information to specify the performance part of the
representation are discussed.

[Boro72] Abstract: Some consequences of the Blum axioms for step counting functions are investigated.
Complexity classes of recursive functions are introduced analogous to the Hartmanis-Stearns classes of recursive
sequences. Arbitrarily large “gaps” are shown to occur throughout any complexity hierarchy.

[Boug86]  Abstract:  [The  authors]  present  a  method  and  a  tool  for  generating  test  sets  from  algebraic  data  type 
specifications. [The authors] give formal definitions of the basic concepts required in [their] approach to functional
testing. Then [the authors] discuss the problem of testing algebraic data types implementations. This allows the
introduction of additional hypotheses and thus the description of the method is based on logic programming.
Some limitations of PROLOG are discussed and two extensions are presented, METALOG and SLOG, which
allow good implementations of [the authors’] method.

[Bowe79]  Abbreviated  Introduction:  This  article  addresses  the  integration  and  test  phase  by  surveying  military 
standards  for  software  quality,  proposed  quality  metrics,  and  techniques  that  evaluate  the  readiness  of  software 
for  acceptance  testing.  Some  of  the  techniques  discussed,  such  as  providing  test  result  visibility  to  the  customer, 
test  effectiveness,  and  regression  testing,  apply  to  any  software  life-cycle  phase. 

[Bowe80]  Summary:  A  standard  software  error  classification  is  viable  based  on  experimental  use  of  different 
schemes  on  Hughes-Fullerton  projects.  Error  classification  schemes  have  proliferated  independently  due  to 
varied  emphasis  on  depth  of  causal  traceability  and  when  error  data  was  collected.  A  standard  classification  is 
proposed  that  can  be  applied  to  all  phases  of  software  development.  It  includes  a  major  causal  category  for 
design errors. Software error classification is a prerequisite for both feedback for error prevention and
detection, and for prediction of residual errors in operational software.

[Bowe83]  Abstract:  Software  metrics  (or  measurements)  which  are  used  to  indicate  and  predict  levels  of 
software quality were extended from previous research to include considerations for distributed computing
systems. Aspects of the products of software life-cycle activities which could affect the quality levels of software,
and metrics to measure them, were identified. Two new quality factors, survivability and expandability, were
validated. A guidebook for Software Quality Measurement was produced to aid in setting quality goals, applying
metric  measurements,  and  making  quality  level  assessments.  New  metrics  for  interoperability  and  reusability 
were  also  included  in  the  guidebook. 

[Bowe85]  Abbreviated  Preface:  The  purpose  of  this  contract  was  to  (1)  consolidate  results  of  previous  RADC 
contracts  dealing  with  software  quality  measurement,  (2)  enhance  the  software  quality  framework,  and  (3) 
develop a methodology to enable a software acquisition manager to determine and specify software quality
factors requirements. We developed the methodology and framework elements to focus on an Air Force
software acquisition manager specifying quality requirements for embedded software that is part of a command
and control application. This methodology and most of the framework elements are generally useful for other
applications and different environments. The Final Technical Report consists of three volumes:

•  Volume  I,  Specification  of  Software  Quality  Attributes  -  Final  Report. 

• Volume II, Specification of Software Quality Attributes - Software Quality Specification Guidebook.

• Volume III, Specification of Software Quality Attributes - Software Quality Evaluation Guidebook.

Volume I describes the results of research efforts conducted under this contract, including
recommendations for integrating quality metrics technology into the Air Force software acquisition management process,
recommended  changes  to  Air  Force  software  acquisition  documentation,  and  summaries  of  software  quality 
framework  changes  and  specification  methodology  features. 

Volumes II and III describe the methodology for using the quality metrics technology and include an
overview of the software acquisition process using this technology and the quality framework. Volume II describes
methods  for  specifying  software  quality  requirements  and  addresses  the  needs  of  the  software  acquisition 
manager.  Volume  III  describes  methods  for  evaluating  achieved  quality  levels  of  software  products  and  describes 
the  needs  of  data  collection  and  analysis  personnel. 

Volume  II  also  describes  procedures  and  techniques  for  specifying  software  quality  requirements  in  terms 
of  quality  factors  and  criteria.  Factor  interrelationships,  relative  costs  to  develop  high  quality  levels,  and  an 
example for a command and control application are described. Procedures for assessing compliance with
specified requirements are included.

Volume III also describes procedures and techniques for evaluating achieved quality levels of software
products.  Worksheets  for  collecting  metric  data  by  software  life-cycle  phase  and  scoresheets  for  scoring  each 
factor  are  provided  in  the  appendices.  Detailed  metric  questions  on  worksheets  are  nearly  identical  to  questions 
in  the  Software  Evaluation  Reports  proposed  as  part  of  the  STARS  measurement  data  item  descriptions. 

[Boye75] Abstract: SELECT is an experimental system for assisting in the formal systematic debugging of
programs. It is intended to be a compromise between an automated program proving system and the current ad hoc
debugging practice, and is similar to a system being developed by King et al. of IBM². SELECT systematically
handles  the  paths  of  programs  written  in  a  LISP  subset  that  includes  arrays.  For  each  execution  path  SELECT 
returns  simplified  conditions  on  input  variables  that  cause  the  path  to  be  executed,  and  simplified  symbolic 
values for program variables at the path output. For conditions which form a system of linear equalities and
inequalities SELECT will return the input variable values that can serve as sample test data. The user can insert
constraint  conditions,  at  any  point  in  the  program  including  the  output,  in  the  form  of  symbolically  executable 
assertions.  These  conditions  can  induce  the  system  to  select  test  data  in  user-specified  regions.  SELECT  can  also 
determine  if  the  path  is  correct  with  respect  to  an  output  assertion.  We  present  four  examples  demonstrating  the 
various modes of system operation and their effectiveness in finding bugs. In some examples, SELECT was successful
in automatically finding useful test data. In others, user interaction was required in the form of output
assertions.  SELECT  appears  to  be  a  useful  tool  for  rapidly  revealing  program  errors,  but  for  the  future  there  is  a 
need  to  expand  its  expressive  and  deductive  power. 
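
The path-condition idea in the [Boye75] abstract can be illustrated with a minimal Python sketch. It is not the SELECT system: the toy program, the names PROGRAM, paths, and sample_input, and the brute-force search (standing in for solving linear inequalities) are all assumptions made for illustration.

```python
from itertools import product

# A toy program over integer inputs (x, y), written as a decision tree.
# Internal node: (condition_text, predicate, true_branch, false_branch).
# Leaf: a string naming the symbolic output on that path.
PROGRAM = (
    "x > y", lambda x, y: x > y,
    ("x - y > 10", lambda x, y: x - y > 10, "x - y", "x"),
    "y",
)

def paths(node, conds=()):
    """Yield (path_condition, symbolic_output) for every execution path."""
    if isinstance(node, str):
        yield conds, node
        return
    text, pred, on_true, on_false = node
    yield from paths(on_true, conds + ((text, pred, True),))
    yield from paths(on_false, conds + ((text, pred, False),))

def sample_input(conds, lo=-20, hi=20):
    """Search a small integer box for inputs that drive execution down
    the path described by conds; return None if none is found."""
    for x, y in product(range(lo, hi + 1), repeat=2):
        if all(pred(x, y) == want for _, pred, want in conds):
            return x, y
    return None

if __name__ == "__main__":
    for conds, output in paths(PROGRAM):
        shown = " and ".join(t if want else f"not ({t})" for t, _, want in conds)
        print(f"{shown} -> {output}; sample input {sample_input(conds)}")
```

Each printed line pairs a path condition with the symbolic output and one concrete input that exercises the path, which is the kind of sample test data the abstract describes.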

2. IBM is a registered trademark of International Business Machines Corp.

[Boye79] Abbreviated Preface: This book is a user’s guide to a computational logic. A “computational logic” is
a mathematical logic that is both oriented towards discussion of computation and mechanized so that proofs can
be checked by computation. The computational logic discussed in this handbook is that developed by Boyer and
Moore.

This handbook contains a precise and complete description of our logic and a detailed reference guide to
the associated mechanical theorem proving system. In addition, the handbook includes a primer for the logic as
a functional programming language, an introduction to proofs in the logic, a primer for the mechanical theorem
prover, stylistic advice on how to use the logic and theorem prover effectively, and many examples.

The  logic  was  last  described  completely  in  our  book  A  Computational  Logic,  published  in  1979.  In  the 
eight  years  since  [this  book]  was  published,  the  logic  and  the  theorem  prover  have  changed.  On  two  occasions  we 
changed the logic, both times concerned with the problem of axiomatizing an interpreter for the logic as a function
in the logic but motivated by different applications.

There  have  been  two  truly  important  changes  to  the  theorem  prover  since  1979,  neither  of  which  has  to  do 
with  additions  to  the  logic.  One  was  the  integration  of  a  linear  arithmetic  decision  procedure.  The  other  was  the 
addition  of  a  rather  primitive  facility  permitting  the  user  to  give  hints  to  the  theorem  prover. 

The most important changes have occurred not in the logic or the code but in our understanding and use of
them.  The  most  impressive  number  theoretic  result  proved  in  1979  was  the  existence  and  uniqueness  of  prime 
factorizations;  it  is  now  Gauss’s  law  of  quadratic  reciprocity.  The  most  impressive  metamathematical  result  was 
the soundness and completeness of a propositional calculus decision procedure; it is now Gödel’s incompleteness
theorem. These results are not isolated peaks on a plain but just the highest ones in ranges explored with the
system. 

[Boye80] Abstract: This note discusses two theorem-proving questions that received substantial discussion during
the Workshop. It does not pretend to be a thorough or impartial summary of every significant theorem-proving
issue raised.

[Boye84a]  Abstract:  This  article  consists  of  three  parts:  a  tutorial  introduction  to  a  computer  program  which 
proves  theorems  by  induction;  a  brief  description  of  recent  applications  of  that  theorem-prover;  and  a  discussion 
of several nontechnical aspects of the problem of building automatic theorem-provers. The theorem-prover
described has proven theorems such as the uniqueness of prime factorisations, Fermat’s theorem and the recursive
unsolvability of the halting problem.

[Bran80]  Abstract:  Guidelines  are  given  for  program  testing  and  verification  to  insure  quality  software  for  the 
programmer  working  alone  in  a  computing  environment  with  limited  resources.  The  emphasis  is  on  verification 
as  an  integral  part  of  the  software  development.  Guidance  includes  developing  and  planning  testing  as  well  as  the 
application of other verification techniques at each lifecycle stage. Relying upon neither automated tools nor formal
quality assurance support, the guidelines should be appropriate for applications programmers doing small
development  projects. 

[Brin73]  Summary:  A  central  problem  in  program  design  is  to  structure  a  large  program  such  that  it  can  be 
tested  systematically  by  the  simplest  possible  techniques.  This  paper  describes  the  method  used  to  test  the  RC 
4000  multiprogramming  system.  During  testing,  the  system  records  all  transitions  of  processes  and  messages 
between various queues. The test mechanism consists of fifty machine instructions centralized in two procedures.
By using this mechanism in a series of carefully selected test cases, the system was made virtually error
free  within  a  few  weeks.  The  test  procedure  is  illustrated  by  examples. 

[Brin78] Summary: This paper describes a systematic method for testing monitor modules which control process
interactions in concurrent programs. A monitor is tested by executing a concurrent program in which the
processes  are  synchronized  by  a  clock  to  make  the  sequence  of  interactions  reproducible.  The  method  separates 
the  construction  and  implementation  of  test  cases  and  makes  the  analysis  of  a  concurrent  experiment  similar  to 
the  analysis  of  a  sequential  program.  The  implementation  of  a  test  program  is  almost  mechanical.  The  method, 
which  is  illustrated  by  an  example,  has  been  successfully  used  to  test  a  multicomputer  network  program  written 
in  Concurrent  Pascal. 

[Brin85]  Abstract:  Now  that  several  Ada  compilers  and  interpreters  have  been  validated,  increased  attention  is 
being  given  to  Ada  Programming  Support  Environments  and  the  tools  needed  for  Ada  program  development. 
This  paper  discusses  the  capabilities  needed  in  an  Ada  debugger  in  light  of  the  language’s  tasking  constructs,  and 
presents the design for a debugger which operates in concert with a single-processor Ada interpreter. This
debugger design demonstrates the extensions to sequential debugging techniques that are necessary to handle
concurrency, and shows that significant debugging functionality can be provided even without the inclusion of
automatic  error  diagnosis  methods.  The  issues  considered  here  include  isolation  of  effects  and  display  of  the  full 
dynamic  execution  status,  both  of  which  are  essential  to  diagnosis  of  concurrent  programs. 

[Brit88]  Conclusion:  As  we  develop  better  tools  for  recording  and  compiling  software  designs  and  code,  those 
who  think  about  and  practice  programming  will  take  greater  interest  in  the  more  obscure  aspects  of  a  program: 
its  intent,  meaning,  resilience,  and  developmental  history.  Although  the  problem  of  writing  correct  programs, 
especially  those  embedded  within  large  systems  or  products,  remains  largely  unsolved  in  practice,  the  situation  is 
improving.  We  can  use  inspections  to  further  the  investigation  into  how  correct  programs  are  constructed. 
Several  such  inspections  will  be  carried  out  to  determine  their  usefulness  and  refine  their  practice.  The  purpose 
of incorporating correctness arguments into inspections is not to improve inspections, but to improve programming.
This is not a modest objective. Steps will necessarily be small.

[Broo75]  Abbreviated  Preface:  In  many  ways,  managing  a  large  computer  programming  project  is  like  managing 
any other large undertaking - in more ways than most programmers believe. But in many other ways it is
different - in more ways than most professional managers expect.

Managing  OS/360  development  was  a  very  educational  experience,  albeit  a  very  frustrating  one.  The  team, 
including  F.M.  Trapnell  who  succeeded  [the  author]  as  manager,  has  much  to  be  proud  of.  The  system  contains 
many  excellencies  in  design  and  execution,  and  it  has  been  successful  in  achieving  widespread  use.  Certain  ideas, 
most  noticeably  device-independent  input-output  and  external  library  management,  were  technical  innovations 
now  widely  copied.  It  is  now  quite  reliable,  reasonably  efficient,  and  very  versatile.  The  effort  cannot  be  called 
wholly successful, however. The flaws in design and execution pervade especially the control program, as distinguished
from the language compilers. Furthermore, the product was late, it took more memory than planned,
the  costs  were  several  times  the  estimate,  and  it  did  not  perform  very  well  until  several  releases  after  the  first. 

After  leaving  IBM  in  1965,  [the  author]  began  to  analyze  the  OS/360  experience  to  see  what  management 
and technical lessons were to be learned. In particular, [the author] wanted to explain the quite different management
experiences encountered in System/360 hardware development and OS/360 software development.

My own conclusions are embodied in the essays that follow, which are intended for professional programmers,
professional managers, and especially professional managers of programmers.

Although  written  as  separable  essays,  there  is  a  central  argument  contained  especially  in  Chapters  2-7. 
Briefly, [the author] believes that large programming projects suffer management problems different in kind from
small ones, due to division of labor. [The author] believes the critical need to be the preservation of the conceptual
integrity of the product itself. These chapters explore both the difficulties of achieving this unity and
methods  for  doing  so.  The  later  chapters  explore  other  aspects  of  software  engineering  management. 

[Broo80a] Abstract: The application of behavioral or psychological techniques to the evaluation of programming
languages and techniques is an approach which has found increased applicability over the past decade. In
order  to  use  this  approach  successfully,  investigators  must  pay  close  attention  to  methodological  issues,  both  in 
order  to  insure  the  generalizability  of  their  findings  and  to  defend  the  quality  of  their  work  to  researchers  in  other 
fields.  Three  major  areas  of  methodological  concern,  the  selection  of  subjects,  materials,  and  measures,  are 
reviewed.  The  first  two  of  these  areas  continue  to  present  major  difficulties  for  this  type  of  research. 

[Broo81] Introduction: A statistical analysis was conducted of structured programming and programmer performance,
with productivity measured as lines of code per man-month. The study findings support the following
productivity  hypotheses:  (1)  Increasing  the  complexity  of  programming  projects  tends  to  lower  productivity.  (2) 
The  use  of  structured  programming  results  in  increased  productivity.  (3)  Structured  programming  technology  has 
the  highest  payoff  for  severely  constrained  complex  projects,  the  improvement  ranging  from  200%  to  over  600% 
as  compared  to  similar  projects  using  conventional  technology.  The  study  tends  to  rule  out  the  possibility  that  the 
following  factors  could  be  responsible  for  the  higher  productivity  of  projects  using  structured  programming:  (1) 
the application of structured programming only to less complex, less constrained projects that would be likely to
exhibit  productivity  anyway;  (2)  the  assignment  of  more  experienced  programmers,  who  would  probably  exhibit 
higher  productivity,  to  projects  using  structured  programming;  and  (3)  a  “code  explosion”  (an  increase  in  the 
number  of  lines  of  code  produced  due  to  the  tabular  format  of  structured  programs). 

[Brop87] Abstract: As Ada is introduced into new environments, both managers and developers need to understand
the ways in which the decision to use Ada as the target language will affect the software development lifecycle.
The Flight Dynamics division at NASA Goddard Space Flight Center is involved in a study analyzing the
effects of Ada on the development of their software. This project is one of the first to use Ada in this environment.
In the study, two teams are each developing satellite simulators from the same specifications, one in Ada
and  one  in  FORTRAN,  the  standard  language  in  this  environment.  This  paper  will  address  the  lessons  learned 
during  the  design  phase  including  the  effect  of  specifications  on  Ada-oriented  design,  the  importance  of  the 
documentation  style  for  the  chosen  design  method,  and  the  effects  of  Ada-oriented  design  on  the  software 
development  lifecycle.  It  is  hoped  that  the  issues  faced  in  this  project  will  show  more  clearly  what  may  be 
expected  in  designing  with  Ada-oriented  design  methods. 

[Brow72a] Abbreviated Introduction: From the point of the user, a reliable computer program is one which performs
satisfactorily according to the computer program’s specifications. The ability to determine if a computer
program  does  indeed  satisfy  its  specifications  is  most  often  based  upon  accumulated  experience  in  using  the 
software.  This  is  due  in  part  to  general  agreement  that  the  quality  of  computer  software  increases  as  the  software 
is  extensively  used  and  failures  are  discovered  and  corrected.  In  keeping  with  this  philosophy,  increasing 
emphasis  has  been  placed  on  exhaustive  testing  computer  programs  as  the  principal  means  of  assuring  sufficient 
quality. 

Nevertheless,  a  significant  problem  which  pervades  all  software  development  is  a  lack  of  knowledge  as  to 
how  much  testing  of  a  software  system  or  component  constitutes  sufficient  verification.  As  a  result,  we  often  lack 
sufficient  confidence  that  the  software  will  continue  to  operate  successfully  for  unanticipated  combinations  of 
data  in  a  real-world  environment. 

In  recognition  of  the  high  cost  and  uncertainty  of  software  verification,  TRW  Systems’  Product  Assurance 
Office  initiated  a  company-funded  effort  to  improve  upon  current  testing  methodology.  The  result  of  the  study, 
experimentation, design and development thus far conducted comprises the TRW Product Assurance Confidence
Evaluator (PACE) system, an evolving collection of automated tools which provide support in various
phases  of  software  testing. 

The  initial  PACE  instance  was  the  FLOW  program  to  support  test  evaluation  activities.  FLOW  monitors 
statement  usage  during  test  execution,  thus  providing  a  basic  evaluation  of  test  effectiveness.  In  addition,  FLOW 
supports  the  test  planning  activity  by  indicating  the  unexercised  code,  and  consequently,  the  additional  tests 
required  for  more  comprehensive  testing. 
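
As a rough illustration of the statement-usage monitoring described for FLOW, the following Python sketch records which lines of a function run during a test. It is not the TRW tool: it relies on Python's standard sys.settrace hook, and the names executed_lines and classify are hypothetical.

```python
import sys

def executed_lines(func, *args):
    """Run func(*args) and return the set of line offsets (relative to
    the 'def' line) that were executed inside func itself."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the function under test.
        if frame.f_code is code:
            if event == "line":
                hit.add(frame.f_lineno - code.co_firstlineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return hit

def classify(x):
    if x < 0:
        return "negative"
    return "non-negative"

if __name__ == "__main__":
    first = executed_lines(classify, 5)
    print("exercised by classify(5):", sorted(first))
    # Offset 2 (the "negative" branch) is missing, so another test is needed.
    print("after adding classify(-1):", sorted(first | executed_lines(classify, -1)))
```

Comparing the recorded offsets against the function body shows which statements remain unexercised, which in turn suggests the additional tests to plan.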

[Brow75] Abstract: This paper presents a formulation of a novel methodology for evaluation of testing in support
of operational reliability assessment and prediction. The methodology features an incremental evaluation of
the  representativeness  of  a  set  of  development  and  validation  test  cases  together  with  definition  of  additional  test 
cases  to  enhance  those  qualities. 

If test cases are derived in typical fashion (i.e., to find and remove bugs, to investigate software performance
under off-nominal conditions, to exercise structural elements and functional capabilities of the software,
and  to  demonstrate  satisfaction  of  software  requirements),  then  the  complete  set  of  test  cases  is  not  necessarily 
representative  of  anticipated  operational  usage.  The  paper  reports  on  initial  research  into  formulation  of  valid 
measures  of  testing  representativeness. 

Several techniques which permit specification of expected operational usage are described, and a technique
for evaluating the correlation between actual testing accomplished and expected operational usage is
defined. An unbiased estimator for operational usage reliability is proposed and justified as a function of a specified
operational profile; confidence in the estimate is derived from a measure of the degree to which testing is
representative  of  expected  operational  application. 

An experimental application of the techniques to a small program is provided as an illustration of the proposed
use of the methodology for operational software reliability estimation. The relationship between structural
exercise testing thoroughness and operational usage representativeness is discussed; the specification of a quantified
reliability requirement and an explicit, required representativeness measure (or confidence) is identified as
integral  to  effective  application  of  the  proposed  reliability  testing  methodology;  efforts  to  extend,  formalize  and 
generalize  the  methodology  are  described;  and  expected  benefits,  as  well  as  potential  problems  and  limitations 
are  identified. 

[Brow78] Abstract: FAST (Fortran Analysis System) implements a powerful set of analysis capabilities on Fortran
source language programs. Its implementation was accomplished through the integration of existing
software systems and by the use of modern language system development tools. The result is an order of magnitude
reduction in effort of implementation coupled with a sizable increase in system capabilities. The use of a
general purpose, commercially available data management system as a data handler and data correlator is a dominant
factor in both reduction of effort of implementation and generation of additional power and flexibility in
the  analysis  capabilities  for  systematically  qualified  program  analyses  which  is  unique  among  existing  program 
analyzers.  This  capability  should  be  particularly  useful  in  the  program  maintenance  environment. 

[Brow89]  Abstract:  A  probabilistic  model  is  presented  which  determines  the  optimal  number  of  software  test 
cases  required  in  situations  where  the  following  can  be  estimated  as  independent  parameters:  1)  the  cost  per  test, 
2) the cost per error if undetected until field implementation, 3) the number of software executions over its lifetime,
4) the number of possible different executions, and 5) the number of faults embedded in the software. A
formula  is  derived  by  the  use  of  calculus  which  is  solved  by  approximation  techniques.  Tables  of  optimal  number 
of  tests  over  a  range  of  parameter  values  are  presented  to  illustrate  the  results.  The  model  serves  as  a  basis  to 
crystallize  further  research  efforts  to  improve  the  accuracy  of  input  variable  estimation. 

[Brue83]  Abstract:  This  paper  introduces  a  modified  version  of  path  expressions  called  Path  Rules  which  can  be 
used as a debugging mechanism to monitor the dynamic behavior of a computation. Path rules have been implemented
in a remote symbolic debugger running on the Three Rivers Computer Corporation PERQ computer
under  the  Accent  operating  system. 

[Brya80] Abstract: This paper discusses the application of software product assurance to actual on-going projects.
Several facets of software product assurance are presented in terms of their application to real-life situations.
The performance of product assurance usually enhances product integrity. This benefit is obtained to a
lesser  or  greater  degree  regardless  of  when  product  assurance  is  first  introduced  in  a  project. 

[Bryk89]  Preface:  The  purpose  of  IDA  Memorandum  M-496,  Bibliography  of  Testing  and  Evaluation  Reference 
Material, is to present the reference material acquired in the course of developing IDA Paper P-2132, SDS Testing
and Evaluation: A Review of the State-of-the-Art in Software Testing and Evaluation With Recommended R&D
Tasks.  This  document  was  prepared  for  the  Strategic  Defense  Initiative  Organization  (SDIO). 

[Buck79] Abstract: The increasing criticality of software mandates a standard for software quality assurance
plans.  Such  a  standard,  developed  by  the  Computer  Society  Software  Engineering  Standards  Subcommittee, 
appears  here. 

[Budd78a] Introduction: When testing software the major question which must always be addressed is “If a program
is correct for a finite number of test cases, can we assume it is correct in general?” Test data which possess
this property is called Adequate test data, and, although adequate test data cannot in general be derived algorithmically,
several methods have recently emerged which allow one to gain confidence in one’s test data adequacy.

Program mutation is a radically new approach to determining test data adequacy which holds promise of
being a major breakthrough in the field of software testing. The concepts and philosophy of program mutation
have been given elsewhere; the following will merely present a brief introduction to the ideas underlying the system.

Unlike  previous  work,  program  mutation  assumes  that  competent  programmers  will  produce  programs 
which,  if  they  are  not  correct,  are  “almost”  correct.  That  is,  if  a  program  is  not  correct  it  is  a  “mutant”  -  it 
differs  from  a  correct  program  by  simple  errors.  Assuming  this  natural  premise,  a  program  P  which  is  correct  on 
test  data  T  is  subjected  to  a  series  of  mutant  operators  to  produce  mutant  programs  which  differ  from  P  in  very 
simple  ways.  The  mutants  are  then  executed  on  T.  If  all  mutants  give  incorrect  results  then  it  is  very  likely  that  P 
is  correct  (i.e.,  T  is  adequate).  On  the  other  hand,  if  some  mutants  are  correct  on  T  then  either:  (1)  the  mutants 
are  equivalent  to  P,  or  (2)  the  test  data  T  is  inadequate.  In  the  latter  case,  T  must  be  augmented  by  examining  the 
non-equivalent  mutants  which  are  correct  on  T:  a  procedure  which  forces  close  examination  of  P  with  respect  to 
the  mutants. 

At  first  glance  it  would  appear  that  if  T  is  determined  adequate  by  mutation  analysis,  then  P  might  still 
contain  some  complex  errors  which  are  not  explicitly  mutants  of  P.  To  this  end  there  is  a  COUPLING  EFFECT 
which  states  that  test  data  in  which  all  simple  mutants  fail  is  so  sensitive  that  it  is  highly  likely  that  all  complex 
mutants  must  also  fail. 
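
The mutation procedure summarized in [Budd78a] can be sketched in a few lines of Python. This is an illustrative toy, not the authors' system: the program P, the relational-operator mutants, and the helper names make_largest and surviving_mutants are all assumptions chosen for the example.

```python
import operator

def make_largest(compare):
    """Build a 'largest element' routine around one relational operator."""
    def largest(values):
        best = values[0]
        for v in values[1:]:
            if compare(v, best):
                best = v
        return best
    return largest

# P uses '>'. Each mutant differs from P by a single operator substitution.
P = make_largest(operator.gt)
MUTANTS = {
    "ge": make_largest(operator.ge),  # returns the same value as P: equivalent
    "lt": make_largest(operator.lt),
    "le": make_largest(operator.le),
    "ne": make_largest(operator.ne),
}

def surviving_mutants(test_data):
    """Names of mutants that give P's result on every test case in T."""
    return sorted(
        name for name, mutant in MUTANTS.items()
        if all(mutant(t) == P(t) for t in test_data)
    )

if __name__ == "__main__":
    T = [[1, 2, 3]]
    print(surviving_mutants(T))   # ['ge', 'ne']: 'ne' survives, so T is inadequate
    T.append([3, 1, 2])           # augment T after examining the 'ne' mutant
    print(surviving_mutants(T))   # ['ge']: only the equivalent mutant remains
```

The first test set leaves a non-equivalent mutant alive, which corresponds to case (2) in the text; examining that mutant suggests the extra test case, after which only a mutant equivalent to P survives, corresponding to case (1).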

[Budd80a]  Abbreviated  Introduction:  In  [this  paper]  we  will  present  two  types  of  theoretical  results  concerning 
the questions: 1) If a program P is written by a competent programmer and if P passes the Φ mutant test with test
data D, does the function actually computed by P equal the partial recursive function that specifies the intended
behavior of P? 2) (Coupling Effect): If P passes the subset of Φ mutant test with data D, does P pass the Φ mutant
test  with  data  D?  General  results  are  expressed  in  terms  of  properties  of  the  language  class  L,  and  specific  results 
for  a  class  of  decision  table  programs  and  for  a  subset  of  LISP.  Portions  of  the  work  on  decision  tables  and  LISP 
have  appeared  elsewhere,  but  the  presentations  given  here  are  both  simpler  and  more  unified.  In  the  final  section 
we  present  a  system  for  applying  program  mutation  to  FORTRAN  and  we  introduce  a  new  type  of  software 
experiment, called a “beat the system” experiment, for evaluating how well our system approximates an affirmative
response to the program mutation questions.

[Budd80d]  Abstract:  We  consider  two  interpretations  for  what  it  means  for  test  data  to  demonstrate  correctness. 
For  each  interpretation,  we  examine  under  what  conditions  data  sufficient  to  demonstrate  correctness  exists,  and 
whether  it  can  be  automatically  detected  and/or  generated.  We  establish  the  relation  between  these  questions  and 
the  problem  of  deciding  equivalence  of  two  programs. 

[Budd85]  Abstract:  Both  theoretical  and  empirical  arguments  suggest  that  specifications  and  implementations 
are equally important sources of information for generating test cases. Nevertheless, the majority of test generation
procedures described in the literature deal only with the program source, ignoring specifications. In this
paper  we  outline  a  procedure  for  measuring  test  case  effectiveness  using  specifications  given  in  predicate  calculus 
form.  This  method  is  similar  to  the  mutation  analysis  method  of  testing  programs. 

[Bunc80]  Abstract:  An  approach  to  analyzing  the  interaction  of  hardware  failure  modes  with  computer  software 
is  described.  The  approach  considers  the  software  requirements,  not  the  design  or  implementation  and  is  an 
extension  of  the  FMEA  (failure  mode  and  effects  analysis)  discipline.  It  has  been  developed  to  address  the  needs 
of  the  Space  Shuttle  Orbiter  Project  and  is  being  applied  to  Orbiter  subsystems.  The  basic  approach  is  applicable 
to  other  hardware/software  systems,  and  guidelines  for  its  application  are  presented. 

[Burs74] Abstract: A method of proving facts about programs is presented in an informal manner, in the hope
that it will have some intuitive appeal to programmers. It derives essentially from Manna’s method, but it is influenced
by the recent idea of “executing” a program symbolically as part of the proof process. Some examples are
worked  out,  including  one  to  invert  a  permutation  in  situ  and  one  to  traverse  a  tree;  the  latter  seems  to  come  out 
rather  easily  this  way.  Finally  this  technique  and  the  Floyd  one  are  related  to  a  system  of  modal  logic. 

[Cail79] Comment: In his article “A Controlled Experiment in Program Testing and Code
Walkthroughs/Inspections,” Glenford J. Myers states that the overall results showing people’s ability to find
errors are rather dismal. Dr. Myers does not indicate whether any of the 59 participants complained about the
program’s style and the tricks it employs. If this were not the case, then not only [is the author] not surprised at
the  result,  but  [he  is]  also  quite  shocked  -  it  would  show  the  participants’  meek  acceptance  of  tricks  and  bad  style 
as  “normal.” 

One  may  wonder  if  it  is  not  less  painful  to  sit  down  and  code  the  program  anew  and  more  reliably  in  the 
time  spent  on  tracking  down  the  errors.  That  experiment  has  not  been  done,  and  it  would  be  interesting  to  know 
its  results. 

It  is  also  not  obvious  whether  some  of  the  errors  listed  ought  not,  in  fact,  to  be  classified  as  omissions 
from  the  specification;  for  example,  from  the  specification  it  follows  that  a  tab  character  is  not  a  break  character, 
and  nowhere  is  there  any  mention  of  how  many  printing  positions  an  output  line  may  occupy.  Clearly  the  fact  that 
a  tab  is  a  single  character  and  the  fact  that  it  occupies  a  number  of  printing  positions  on  a  certain  device  are  two 
different  things.  A  formatting  program  is  especially  sensitive  to  the  character  set  and  to  the  effects  characters 
have  on  the  output  device. 

[Camp79] Abstract: This paper describes the enhancement of Pascal to specify synchronization between concurrent
processes by path expressions. The extended language is being used to gain experience in the design and
construction  of  practical  real-time  systems  and  operating  systems.  An  encapsulation  mechanism  is  included  to 
synchronize all accesses to encapsulated data. A network message transfer system is presented as an extended
example  of  the  use  of  path  expressions. 

[Card86a]  Abstract:  Software  engineers  have  developed  a  large  body  of  software  design  theory  and  folklore, 
much  of  which  has  never  been  validated.  This  paper  reports  the  results  of  an  empirical  study  of  software  design 
practices in one specific environment. The practices examined affect module size, module strength, data coupling,
descendant span, unreferenced variables, and software reuse. Measures characteristic of these practices
were  extracted  from  887  Fortran  modules  developed  for  five  flight  dynamics  software  projects  monitored  by  the 
Software  Engineering  Laboratory.  The  relationship  of  these  measures  to  cost  and  fault  rate  was  analyzed  using  a 
contingency  table  procedure.  The  results  show  that  some  recommended  design  practices,  despite  their  intuitive 
appeal  are  ineffective  in  this  environment,  whereas  others  are  very  effective. 

[Card87a]  Abstract:  The  theory  of  software  science  proposed  by  Halstead  appears  to  provide  a  comprehensive 
model  of  the  program  construction  process.  Although  software  science  has  been  widely  criticized  on  theoretical 
grounds, its measures continue to be used because of apparently strong empirical support. This study reexamined
one basic relationship proposed by the theory: that between estimated and actual program length. The
results  show  that  the  apparent  agreement  between  these  quantities  is  a  mathematical  artifact.  Analyses  of  both
Halstead’s  own  data  and  another  larger  dataset  confirm  this  conclusion.  Software  science  has  neither  a  firm 
theoretical  nor  empirical  foundation. 
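
Illustrative note (not part of the cited abstract): the relationship under study compares the observed program length with a length estimated from vocabulary sizes. The abstract does not restate the formulas, so the sketch below assumes the commonly cited Halstead definitions: estimated length n1*log2(n1) + n2*log2(n2) for n1 distinct operators and n2 distinct operands, and actual length N1 + N2 for the total operator and operand occurrences. The function names are hypothetical.

```python
import math


def _term(n: int) -> float:
    # n * log2(n), with the n == 0 case defined as 0 to avoid log2(0).
    return n * math.log2(n) if n > 0 else 0.0


def estimated_length(n1: int, n2: int) -> float:
    """Estimated length from distinct operators (n1) and distinct operands (n2)."""
    return _term(n1) + _term(n2)


def actual_length(total_operators: int, total_operands: int) -> int:
    """Observed length: total operator occurrences plus total operand occurrences."""
    return total_operators + total_operands


# Hypothetical counts for one small module (assumed data, not from the study).
est = estimated_length(8, 8)
act = actual_length(30, 20)
print(f"estimated={est:.1f} actual={act}")
```

Comparing the two values over many modules is the kind of agreement the study reexamined.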

[Card87b]  Abstract:  Many  new  software  development  practices,  tools,  and  techniques  have  been  introduced  in 
recent  years.  Few,  however,  have  been  empirically  evaluated.  The  objectives  of  this  study  were  to  measure  tech¬ 
nology  use  in  a  production  environment,  develop  a  statistical  model  for  evaluating  the  effectiveness  of  technolo¬ 
gies,  and  evaluate  the  effects  of  some  specific  technologies  on  productivity  and  reliability.  A  carefully  matched 
sample  of  22  projects  from  the  Software  Engineering  Laboratory  database  was  studied  using  an 
analysis-of-covariance  procedure.  Limited  use  of  the  technologies  considered  in  the  analysis  produced  approxi¬ 
mately  a  30  percent  increase  in  software  reliability.  These  technologies  did  not  demonstrate  any  direct  effect  on 
development  productivity. 

[Care77]  Abstract:  This  paper  presents  a  software  testing  and  Quality  Assurance  technology  based  on  a  special 
set  of  development  methodologies.  A  specific  example  employing  a  top-down  design  process  is  explored  in  depth 
to  demonstrate  traceability  from  requirements  to  system  test.  The  peripheral  advantages  of  this  technology  are 
also  explored. 

[Carp75]  Abstract:  DECA  is  a  computer  program  which  is  used  in  conjunction  with  a  top-down  dominated
design  methodology.  The  program  organizes,  validates,  and  produces  a  document  depicting  the  design  of  a 
software  system.  The  use  of  DECA  significantly  enhances  the  quality  of  the  software  design.  The  quality  of  the 
design  in  turn  significantly  benefits  the  quality  of  the  implemented  software  system. 

[Carr88]  Abbreviation:  One  general  approach  to  detecting  synchronization  errors,  called  static  analysis,  is  to 
analyze  (not  execute)  the  program  to  derive  an  approximation  of  the  feasibility  set  of  a  concurrent  program.  A 
number  of  (static)  analysis  techniques  have  been  developed  for  detecting  synchronization  errors.  Generally, 
these  analysis  techniques  derive  an  approximation  set  which  is  the  set  of  syntactically  possible  SYN-sequences; 
such  techniques  are  referred  to  as  syntax-based  synchronization  analysis  techniques. 

We  have  developed  a  new  approach  to  analyzing  concurrent  programs,  which  is  to  derive  constraints  on 
the  feasible  SYN-sequences  of  a  concurrent  program  according  to  the  program’s  syntactic  and  semantic  information.
These  constraints,  called  feasibility  constraints  (or  constraints  if  there  is  no  ambiguity),  show  restrictions  on
the  ordering  of  synchronization  events  allowed  by  the  program.  By  using  feasibility  constraints,  we  can  obtain  a 
better  approximation  of  the  feasibility  set  of  a  concurrent  program  and  improve  the  effectiveness  of  error  detec¬ 
tion  by  static  analysis. 

[Cava78]  Abstract:  Research  in  software  metrics  incorporated  in  a  framework  established  for  software  quality 
measurement  can  potentially  provide  significant  benefits  to  software  quality  assurance  programs.  The  research 
described  has  been  conducted  by  General  Electric  Company  for  the  Air  Force  Systems  Command  Rome  Air 
Development  Center.  The  problems  encountered  defining  software  quality  and  the  approach  taken  to  establish  a 
framework  for  the  measurement  of  software  quality  are  described  in  this  paper. 

[Cha88]  Abstract:  MURPHY  is  a  language-independent,  experimental  methodology  for  building  safety-critical, 
real-time  software,  which  will  include  an  integrated  tool  set.  Using  Ada  as  an  example,  this  paper  presents  a  technique
for  verifying  the  safety  of  complex,  real-time  software  using  Software  Fault  Tree  Analysis.  The  templates
for  Ada  are  presented  along  with  an  example  of  applying  the  technique  to  an  Ada  program.  The  tools  in  the 
MURPHY  tool  set  to  aid  in  this  type  of  analysis  are  described. 

[Chan73]  Abstract:  The  notion  of  a  program  structure  has  inspired  several  authors  to  describe  techniques  for 
producing  programs  that  have  “good”  structure.  These  techniques,  however,  do  not  include  definitions  for  program
structure  or  good  structure.  It  is  simply  asserted  that  programs  produced  using  these  techniques  either  have
good  structure  or  are  more  likely  to  have  good  structure  than  programs  produced  without  using  these  techniques. 
Instead,  good  structure  has  been  characterized  by  certain  properties.  For  example,  the  work  of  Dijkstra  and 
Parnas  as  well  as  the  work  of  Simon  and  Alexander,  concerning  complexity  in  systems,  suggest  that  programs  or
a  system  of  programs  having  good  structure  possess  several  properties. 

[Chan84]  Abstract:  A  programmer  often  writes  and  tests  programs  in  a  bottom-up  manner,  producing  code
fragments  and  testing  each  fragment  on  a  few  examples  to  convince  himself  that  the  program  works  so  far.  These
intermediate  tests  are  typically  lost  without  full  utilization.  The  objective  of  this  project  is  to  create  a  kind  of 
information  retrieval  system  for  test  cases  to  remedy  this  situation.  The  “program  testing  assistant”  described 
herein  is  intended  to  aid  BASIC-PLUS  programmers  during  incremental  program  development.  As  in  the  production
of  any  software  tool,  issues  of  ease  of  use  and  user-friendliness  are  of  main  concern  in  our  testing
assistant,  along  with  software  engineering  considerations  such  as  maintainability  and  reliability.

[Chan85]  Introduction:  The  importance  of  software  testing  in  high-reliability,  real-time  applications  such  as 
telecommunications  switching  cannot  be  overemphasized.  The  software  of  these  systems  must  meet  high  reliabil¬ 
ity  requirements  while  supporting  a  wide  range  of  functions. 

Requirements  validation  depends  largely  on  the  techniques  used  for  specification,  so  the  augmented  finite- 
state  machine  (FSM)  model  used  to  capture  the  external  behavior  of  a  real-time  system  is  central  to  the  environ¬ 
ment.  Therefore,  this  paper  describes  the  validation  techniques  used  and  explores  those  aspects  of  the 
specification  model  that  facilitate  test  generation  and  execution  using  the  behavioral  description.  We  refer  to  this
theme  as  requirements  modeling  for  testability. 

Our  thesis  is:  although  an  FSM  or  an  augmented  FSM  may  be  relatively  limited  in  its  ability  to  capture  the 
whole  range  of  practical  systems’  behaviors,  it  is  adequate  for  real-time  systems  in  which  sequential  computa¬ 
tions  dominate.  The  FSM  model  should  be  preferred  for  applications  in  which  high  reliability  is  a  primary  con¬ 
cern;  expressive  power  can  be  traded  off  to  ensure  quality. 

[Chan89]  Abbreviated  Introduction:  System  performance  concerns  most  system  analysts.  To  model  perfor¬ 
mance,  you  need  a  practical  modeling  method  tailored  to  your  environment.  Existing  modeling  methods,  which 
can  model  either  queuing  behavior  or  asynchronous  concurrent  behavior,  are  inadequate  for  today’s  complex 
systems.  Many  applications,  including  multiprocessor  operating  systems  and  distributed  systems,  require  both 
modeling  capabilities. 

We  present  an  approach  that  uses  an  enhanced  model  based  on  two  familiar  modeling  methods,  the  queuing
network  and  Petri  net.  Our  approach  includes  a  graphical  modeling  tool,  called  TPQN,  a  textual  specification
language,  called  TPQL,  and  a  simulator,  called  TPQS. 

TPQN  can  represent  a  system  with  both  synchronization  conditions  and  queuing  behaviors,  something 
that  cannot  be  modeled  with  either  the  Petri  net  or  queuing  network  alone.  We  illustrate  TPQN’s  capabilities  with
a  performance  analysis  of  a  real-time  multitasking  scheduler  that  had  been  previously  implemented  on  top  of 
SunOS  on  a  Sun-3  workstation. 

We  have  conducted  some  experiments  with  the  TPQS  simulator  on  a  Sun-3.  The  results  have  helped  us  to 
analyze  the  system’s  performance  and  validate  our  modeling  tool. 

[Chap79]  Introduction:  Intuition  and  common  sense  generally  agree  that  software  which  appears  simple  is  supe¬ 
rior  to  software  that  appears  complex,  whatever  the  inherent  complexity  of  the  job.  This  position  is  in  fact  incorporated
in  the  appraisal  guidelines  of  structured  design  as  “simplicity.”  But  applying  intuition  and  common
sense  is  not  really  sufficient  to  obtain  consistently  simple  software.  What  is  needed  is  objective,  quantitative,  reli¬ 
able,  valid  and  convenient  ways  of  measuring  either  the  complexity  or  the  simplicity  in  software.  To  that  end,  a 
number  of  proposals  have  been  advanced. 

This  paper  proposes  an  alternative  measure  of  software  complexity.  The  background  of  the  measure  is 
briefly  given,  and  its  computational  procedure  described.  Then  it  is  applied  to  a  given  software  design  of  a  small 
modular  structured  program.  Afterward,  the  measure  is  compared  with  other  alternative  measures  and  with  pro¬ 
grammer  ratings  of  the  program.  The  paper  closes  with  a  discussion  of  the  validity  of  the  proposed  measure  of 
software  complexity. 

[Chap82]  Abstract:  This  paper  describes  the  design  and  implementation  of  a  program  testing  assistant  which 
aids  a  programmer  in  the  definition,  execution,  and  modification  of  test  cases  during  incremental  program 
development.  The  testing  assistant  helps  in  the  interactive  definition  of  test  cases  and  executes  them  automati¬ 
cally  when  appropriate.  It  modifies  test  cases  to  preserve  their  usefulness  when  the  program  they  test  undergoes 
certain  types  of  design  changes.  The  testing  assistant  acts  as  a  fully  integrated  part  of  the  programming  environ¬ 
ment  and  cooperates  with  existing  programming  tools  such  as  a  display  editor,  compiler,  interpreter,  and 
debugger. 

[Chea79]  Abstract:  Symbolic  evaluation  is  a  form  of  static  program  analysis  in  which  symbolic  expressions  are 
used  to  denote  the  values  of  program  variables  and  computations.  It  does  not  require  the  user  to  specify  which 
path  at  a  conditional  branch  to  follow  nor  how  many  cycles  of  a  loop  to  consider.  Instead,  a  symbolic  evaluator 
uses  conditional  expressions  to  represent  the  uncertainty  that  arises  from  branching  and  develops  and  attempts 
to  solve  recurrence  relations  that  describe  the  behavior  of  loop  variables. 

We  describe  a  symbolic  evaluator  for  part  of  the  EL1  language,  with  particular  emphasis  on  techniques  for
handling  conditional  data  sharing  patterns,  the  behavior  of  array  variables,  and  the  behavior  of  variables  in  loops 
and  during  procedure  calls.  An  expression  simplifier,  which  is  the  heart  of  the  system,  is  described  in  some 
detail.  Potential  applications  of  the  symbolic  evaluator  to  problems  in  program  validation,  verification,  and 
optimization  are  mentioned.

[Cheh81]  Abstract:  Four  automated  specification  and  verification  environments  are  surveyed  and  compared: 
HDM,  FDM,  Gypsy,  and  AFFIRM.  The  emphasis  of  the  comparison  is  on  the  way  these  systems  could  be  used 
to  prove  security  properties  of  an  operating  system  design. 

[Chen78a]  Abstract:  This  paper  proposes  a  measure  of  program  control  complexity  from  an  information  theory 
viewpoint.  A  set  of  empirical  data  showing  programmer  productivity  as  a  function  of  program  control  complexity 
is  also  presented.  The  data  reveals  a  step-function-like  contour  to  programmer  productivity  with  increasing  pro¬ 
gram  control  complexity. 

[Chen81]  Abstract:  The  development  of  quantitative  measures  to  evaluate  software  development  techniques  is 
necessary  if  we  are  going  to  develop  appropriate  methodologies  for  software  production.  Data  is  collected  by  the 
Software  Engineering  Laboratory  at  NASA  Goddard  Space  Flight  Center  on  developing  medium  scale  projects 
of  up  to  ten  man  years  effort.  In  this  study,  cluster  analysis  was  used  on  this  collected  data  and  several  measures 
are  proposed.  These  measurements  are  objective,  quantifiable,  are  the  results  of  the  methodology,  and  most 
important,  seem  relevant. 

[Chen83]  Abstract:  Computations  of  distributed  systems  are  extremely  difficult  to  specify  and  verify  using  tradi¬ 
tional  techniques  because  the  systems  are  inherently  concurrent,  asynchronous,  and  nondeterministic.  Further¬ 
more,  computing  nodes  in  a  distributed  system  may  be  highly  independent  of  each  other,  and  the  entire  system 
may  lack  an  accurate  global  clock. 

In  this  paper,  we  develop  an  event-based  model  to  specify  formally  the  behavior  (the  external  view)  and 
the  structure  (the  internal  view)  of  distributed  systems.  Both  control-related  and  data-related  properties  of  distri¬ 
buted  systems  are  specified  using  two  fundamental  relationships  among  events:  the  “precedes”  relation, 
representing  time  order;  and  the  “enables”  relation,  representing  causality.  No  assumption  about  the  existence
of  a  global  clock  is  made  in  the  specifications. 

The  specification  technique  has  a  rather  wide  range  of  applications.  Examples  from  different  classes  of 
distributed  systems,  including  communication  systems,  process  control  systems,  and  a  distributed  prime  number
generator,  are  used  to  demonstrate  the  power  of  the  technique. 

The  correctness  of  a  design  can  be  proved  before  implementation  by  checking  the  consistency  between 
the  behavior  specification  and  the  structure  specification  of  a  system.  Both  safety  and  liveness  properties  can  be 
specified  and  verified.  Furthermore,  since  the  specification  technique  defines  the  orthogonal  properties  of  a  sys¬ 
tem  separately,  each  of  them  can  then  be  verified  independently.  Thus,  the  proof  technique  avoids  the  exponen¬ 
tial  state-explosion  problem  found  in  state-machine  specification  techniques. 

[Cher80a]  Abbreviated  Introduction:  As  part  of  its  current  standards  initiative,  NBS  is  studying  methods  to 
ensure  the  quality  of  both  software  procured  by  the  government  and  software  developed  within  the  government. 
In  this  paper  we  discuss  the  use  of  programming  environments  in  developing  and  procuring  quality  software. 

Since  software  quality  cannot  be  assured  using  standard  control  methods,  we  propose  a  different 
approach  to  ensure  quality.  This  approach  rests  on  the  thesis  that  quality  in  the  software  product  can  be  achieved 
through  control  of  the  development  process.  Hence  we  propose  the  specification  of  software  quality  standards 
not  in  terms  of  properties  of  the  final  software  product  but  by  specifying  how  the  product  should  be  developed. 
The  use  of  development  tools  and  techniques  with  specific  properties,  e.g.,  the  use  of  a  design  specification  sys¬ 
tem  that  includes  data  flow  and  consistency  analysis,  would  be  standard  for  government  procured  software.  Pro¬ 
perties  of  the  processes  for  producing  requirements,  design,  code,  and  testing  would  be  specified  with  software 
quality  standards.  In  addition,  the  products  produced  at  each  development  stage  would  be  recorded  in  adherence 
to  documentation  standards.  The  various  pieces  of  the  development  process  when  incorporated  into  a  single  sys¬ 
tem  would  constitute  a  programming  environment. 

[Cher80b]  Abstract:  This  paper  is  oriented  towards  those  quality  control  problems  peculiar  to  the  procurement 
of  software.  We  discuss  the  deficiencies,  and  possible  corrections,  of  several  current  methodologies.  We  propose
a  set  of  software  management  and  development  tools  for  software  quality  assurance  which  enables  better 
contractor-developer  communication  during  the  development.  The  paper  also  includes  a  discussion  of  how 
sophisticated  programming  environments  can  play  a  central  role  in  procured  software  development  and  a  discus¬ 
sion  of  the  associated  research  issues. 

[Cher86]  Abstract:  Inductive  inference,  the  automatic  synthesis  of  programs,  bears  certain  ostensible  relation¬ 
ships  with  program  testing.  For  inductive  inference,  one  must  take  a  finite  sample  of  the  desired  input/output 
behavior  of  some  program  and  produce  (synthesize)  an  equivalent  program.  In  the  testing  paradigm,  one  seeks  a 
finite  sample  for  a  function  such  that  any  program  (in  a  given  set)  which  computes  something  other  than  the 
object  function  differs  from  the  object  function  on  the  finite  sample.  In  both  cases,  the  finite  sample  embodies 
sufficient  information  to  isolate  the  desired  program  from  all  other  possibilities.  Techniques  from  inductive  infer¬ 
ence  are  used  to  investigate  the  theoretical  limits  of  program  testing  and  to  provide  techniques  for  effective  pro¬ 
gram  testing. 
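
Illustrative note (not part of the cited abstract): the testing paradigm described above asks for a finite sample on which every program in a given set that does not compute the object function disagrees with it. The sketch below is a minimal greedy construction under strong simplifying assumptions: the candidate set is a finite list of Python callables and the input domain is a small finite range. The function name, candidates, and domain are all hypothetical, and nothing here reflects the recursion-theoretic techniques of the paper.

```python
from typing import Callable, List, Sequence


def distinguishing_sample(
    target: Callable[[int], int],
    candidates: Sequence[Callable[[int], int]],
    domain: Sequence[int],
) -> List[int]:
    """Pick inputs so that each candidate that differs from `target`
    somewhere on `domain` also differs from it on the returned sample."""
    sample: List[int] = []
    for g in candidates:
        # Skip candidates the current sample already tells apart from the target.
        if any(g(x) != target(x) for x in sample):
            continue
        # Otherwise look for the first input on which g and the target disagree.
        witness = next((x for x in domain if g(x) != target(x)), None)
        if witness is not None:  # None means g agrees with target on the whole domain
            sample.append(witness)
    return sample


def square(x: int) -> int:
    return x * x


# Assumed candidate set: one correct program and three incorrect ones.
candidates = [square, lambda x: 2 * x, lambda x: x, lambda x: x ** 3]
domain = range(0, 5)
sample = distinguishing_sample(square, candidates, domain)
print(sample)
```

A candidate that agrees with the object function on the whole domain contributes nothing to the sample, which mirrors the requirement that only programs computing something else must be exposed.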

[Cher87b]  Abstract:  Inductive  inference,  the  automatic  synthesis  of  programs,  bears  certain  ostensible  relationships
with  program  testing.  For  inductive  inference,  one  must  take  a  finite  sample  of  the  desired  input/output
behavior  of  some  program  and  produce  (synthesize)  an  equivalent  program.  In  the  testing  paradigm,  one  seeks  a
finite  sample  for  a  function  such  that  any  program  (in  a  given  set)  which  computes  something  other  than  the 
object  function  differs  from  the  object  function  on  the  finite  sample.  In  both  cases,  the  finite  sample  embodies 
sufficient  knowledge  to  isolate  the  desired  program  from  all  other  possibilities.  These  relationships  are  investi¬ 
gated  and  general  recursion  theoretic  properties  of  testable  sets  of  functions  are  exposed. 

[Cher88]  Abstract:  In  this  paper  we  take  an  abstract  set  based  approach  to  testing.  With  this  approach  we  are 
able  to  discuss  testing  issues  which  are  totally  representation  free.  We  develop  a  game  theoretic  approach  to  test¬ 
ing  and  obtain  some  complexity  results  from  this  approach.  We  develop  a  notion  of  testing  in  the  limit  and  dis¬ 
cuss  alternative  definitions  of  testing. 

[Ches77]  Abstract:  In  this  paper  we  introduce  a  software  design  methodology  in  which  the  design  is  constantly 
being  evaluated  as  it  develops.  Our  thesis  is  that  proper  evaluation  methods  can  aid  a  designer  to  make  sure  that 
i)  his  specifications  are  consistent  with  his  intuition  (or  requirements);  ii)  the  quality  of  his  design  is  reasonable. 
Another  way  to  view  our  methodology  is  that  it  helps  a  designer  to  build  mock-up  models  of  his  design  for  evalua¬ 
tion  before  actual  construction  begins. 

Our  approach  is  intended  to  fulfill  three  goals: 

1. To  allow  the  execution  of  designs  as  programs  either  symbolically  or  with  a  specification  interpreter  which 
does  the  equivalent  of  a  hand  simulation. 

2.  To  determine  the  performance  characteristics  of  a  design. 

3.  To  evaluate  the  “quality”  of  a  design  and  aid  the  designer  in  choosing  alternative  designs. 

To  reach  these  goals,  we  are  developing  a  specification  language  for  defining  abstract  models  of  a  pro¬ 
posed  system.  Another  language  is  being  developed  to  document  the  hierarchical  design  process.  Finally,  some 
software  tools  to  aid  our  methodology  are  under  development,  including  an  interpreter  and  some  performance 
modeling  systems. 

[Choq86]  Abstract:  This  paper  deals  with  the  generation  of  test  data  sets  from  algebraic  data  type  specifications. 
We  base  ourselves  on  a  previous  work  where  basic  concepts  required  in  our  approach  to  functional  testing  were
defined  and  where  a  general  method  for  generating  test  data  sets  was  elaborated.  Because  of  the  similarity
between  a  conditional  equation  and  a  clause  in  a  logic  program,  logic  programming  appears  to  be  well
adapted  to  implement  this  method  and  automatically  derive  test  data  from  a  specification.  But  some  limitations
of  standard  logic  programming  interpreters  need  to  be  alleviated.  Especially,  a  logic  interpreter  handling  “con¬ 
straints”  is  necessary  to  apply  in  their  whole  generality  some  hypotheses  made  on  an  implementation  under  test. 
We  study  such  an  extension  of  a  Prolog  interpreter  and  explain  the  improvements  effected.  A  simple  example 
and  a  realistic  one  are  given.

[Chow78]  Abstract:  We  propose  a  method  of  testing  the  correctness  of  control  structures  that  can  be  modeled 
by  a  finite-state  machine.  Test  results  derived  from  the  design  are  evaluated  against  the  specification.  No  “execut¬ 
able”  prototype  is  required.  The  method  is  based  on  a  result  in  automata  theory  and  can  be  applied  to  software 
testing.  Its  error-detecting  capability  is  compared  with  that  of  other  approaches.  Application  experience  is  sum¬ 
marized. 

[Chri81]  Overview:  This  paper  provides  an  overview  of  a  new  approach  to  the  measurement  of  software.  The 
measurements  are  based  on  the  count  of  operators  and  operands  contained  in  a  program.  The  measurement 
methodologies  are  consistent  across  programming  language  barriers.  Practical  significance  is  discussed,  and 
areas  are  identified  for  additional  research  and  validation. 

[Chry78]  Abstract:  The  purpose  of  this  research  was  to  examine  the  relationship  between  processing  characteristics
of  programs  and  experience  characteristics  of  programmers  and  program  development  time.  The  ulti¬
mate  objective  was  to  develop  a  technique  for  predicting  the  amount  of  time  necessary  to  create  a  computer  pro¬ 
gram.  The  fifteen  program  characteristics  hypothesized  as  being  associated  with  an  increase  in  programming 
time  required  are  objectively  measurable  from  preprogramming  specifications.  The  five  programmer  charac¬ 
teristics  are  experience-related  and  are  also  measurable  before  a  programming  task  is  begun.  Nine  program 
characteristics  emerged  as  major  influences  on  program  development  time,  each  associated  with  increased  pro¬ 
gram  development  time.  All  five  programmer  characteristics  were  found  to  be  related  to  reduced  program 
development  time.  A  multiple  regression  equation  which  contained  one  programmer  characteristic  and  four  pro¬ 
gram  characteristics  gave  evidence  of  good  predictive  power  for  forecasting  program  development  time. 

[Chur86]  Abstract:  A  procedure  for  evaluating  a  software  prototype  is  presented.  The  need  to  assess  the  proto¬ 
type  itself  arises  from  the  use  of  prototyping  to  demonstrate  the  feasibility  of  a  design  or  development  strategy. 
The  assessment  procedure  can  also  be  of  use  in  deciding  whether  to  evolve  a  prototype  into  a  complete  system. 
The  procedure  consists  of  identifying  evaluation  criteria,  defining  alternative  design  approaches,  and  ranking  the 
alternatives  according  to  the  criteria. 

[Chus87]  Abstract:  A  new  coverage  measure  is  proposed  for  efficient  and  effective  software  testing.  The  con¬ 
ventional  coverage  measure  for  branch  testing  has  such  defects  as  overestimation  of  software  quality  and  redun¬ 
dant  test  data  selection  because  all  branches  are  treated  equally.  These  problems  can  be  avoided  by  paying  atten¬ 
tion  to  only  those  branches  essential  for  path  testing.  That  is,  if  one  branch  is  executed  whenever  another  partic¬ 
ular  branch  is  executed,  the  former  branch  is  nonessential  for  path  testing.  This  is  because  a  path  covering  the 
latter  branch  also  covers  the  former  branch.  Branches  other  than  such  nonessential  branches  will  be  referred  to 
as  essential  branches. 

A  testing  tool  for  the  new  measure  is  developed  in  order  to  discriminate  essential  branches  from 
nonessential  branches  and  to  measure  the  coverage  rate  of  these  essential  branches.  By  using  this  tool,  it  is  ascer¬ 
tained  that  the  number  of  essential  branches  is  about  60  percent  of  all  branches. 

As  a  result,  the  new  measure  reduces  software  quality  overestimation  because  the  accumulative  curve  of 
the  new  measure  to  the  number  of  executed  test  data  is  closer  to  linearity  than  that  of  the  conventional  measure. 
Another  advantage  is  the  prevention  of  redundant  test  data  selection.  It  results  from  a  40  percent  reduction  in 
the  number  of  branches  to  be  monitored  and  is  confirmed  by  a  reasonable  algorithm  for  test  data  selection. 
Furthermore,  an  efficient  algorithm  for  redundancy  elimination  of  a  selected  test  data  set  is  presented. 
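
Illustrative note (not part of the cited abstract): the definition above says a branch is nonessential when it is executed whenever some other branch is executed. The sketch below applies that definition to a small, explicitly enumerated set of paths, each given as the set of branch labels it covers. Enumerating paths is an assumption made for brevity and is not the tool described in the abstract; the function name, branch labels, and the tie-break for branches that always occur together (keep the smaller label) are also assumptions.

```python
from typing import Dict, List, Sequence, Set


def essential_branches(paths: Sequence[Set[str]]) -> List[str]:
    """Return branches that are not implied by the execution of another branch."""
    # Map each branch to the indices of the paths that execute it.
    covering: Dict[str, Set[int]] = {}
    for index, path in enumerate(paths):
        for branch in path:
            covering.setdefault(branch, set()).add(index)

    essential = []
    for b, paths_b in covering.items():
        # b is nonessential if some other branch a runs only on paths that also
        # run b (proper subset), or on exactly the same paths with a smaller
        # label (assumed tie-break, so one representative is always kept).
        implied = any(
            paths_a < paths_b or (paths_a == paths_b and a < b)
            for a, paths_a in covering.items()
            if a != b
        )
        if not implied:
            essential.append(b)
    return sorted(essential)


# Hypothetical program: an outer decision (b1 / b2) with a nested
# decision (b3 / b4) inside the b1 arm.
paths = [{"b1", "b3"}, {"b1", "b4"}, {"b2"}]
print(essential_branches(paths))
```

In this example any path through b3 or b4 must pass through b1, so covering the inner branches covers b1 as well and b1 need not be monitored.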

[Cinl75]  Contents:  Chapters  in  this  book  address  the  following  topics:  Probability  spaces  and  random  variables; 
Expectations  and  independence;  Bernoulli  processes  and  sums  of  independent  random  variables;  Poisson 
processes;  Markov  chains;  Limiting  behavior  and  applications  of  Markov  chains;  Potentials,  excessive  func¬ 
tions,  and  optimal  stopping  of  Markov  chains;  Markov  processes;  Renewal  theory;  and  Markov  renewal  theory. 

[Clar76b]  Abstract:  This  paper  describes  a  system  that  attempts  to  generate  test  data  for  programs  written  in
ANSI  Fortran.  Given  a  path,  the  system  symbolically  executes  the  path  and  creates  a  set  of  constraints  on  the 
program’s  input  variables.  If  the  set  of  constraints  is  linear,  linear  programming  techniques  are  employed  to 
obtain  a  solution.  A  solution  to  the  set  of  constraints  is  test  data  that  will  drive  execution  down  the  given  path.  If 
it  can  be  determined  that  the  set  of  constraints  is  inconsistent,  then  the  given  path  is  shown  to  be  nonexecutable. 
To  increase  the  chance  of  detecting  some  of  the  more  common  programming  errors,  artificial  constraints  are 
temporarily  created  that  simulate  error  conditions  and  then  an  attempt  is  made  to  solve  each  augmented  set  of  constraints.
A  symbolic  representation  of  the  program’s  output  variables  in  terms  of  the  program’s  input  variables  is
also  created.  The  symbolic  representation  is  in  a  human  readable  form  that  facilitates  error  detection  as  well  as
being  a  possible  aid  in  assertion  generation  and  automatic  program  documentation. 
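
Illustrative note (not part of the cited abstract): the abstract describes collecting the constraints a path imposes on the input variables and solving them to obtain test data. The sketch below represents each path condition as a linear constraint and searches a small integer box for a satisfying assignment. Exhaustive search is swapped in here for the linear programming step named in the abstract, purely to keep the example self-contained; the constraint encoding, function names, and search bounds are assumptions. A result of None only means no solution exists inside the searched box.

```python
import operator
from itertools import product
from typing import Dict, List, Optional, Sequence, Tuple

# A constraint is (coefficients, relation, bound), read as
#   sum(coefficients[v] * v for each variable v)  relation  bound
Constraint = Tuple[Dict[str, int], str, int]

RELATIONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def holds(assignment: Dict[str, int], constraint: Constraint) -> bool:
    coefficients, relation, bound = constraint
    value = sum(c * assignment[v] for v, c in coefficients.items())
    return RELATIONS[relation](value, bound)


def find_test_data(
    variables: Sequence[str],
    constraints: List[Constraint],
    low: int = -10,
    high: int = 10,
) -> Optional[Dict[str, int]]:
    """Return inputs that drive execution down the path, or None if none is
    found within [low, high] for every variable."""
    for values in product(range(low, high + 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(holds(assignment, c) for c in constraints):
            return assignment
    return None


# Hypothetical path: take the true arm of "x > y", then the true arm of "x + y <= 4".
path = [({"x": 1, "y": -1}, ">", 0), ({"x": 1, "y": 1}, "<=", 4)]
data = find_test_data(["x", "y"], path)

# An inconsistent constraint set corresponds to a nonexecutable path.
infeasible = [({"x": 1, "y": -1}, ">", 0), ({"x": 1, "y": -1}, "<", 0)]
no_data = find_test_data(["x", "y"], infeasible)
```

Adding an extra constraint that encodes an error condition (for example, a divisor equal to zero) and re-solving follows the same pattern as the artificial constraints mentioned in the abstract.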

[Clar78a]  Abstract:  An  overview  of  some  of  the  current  program  validation  techniques  is  given.  Though  a 
variety  of  such  techniques  exist,  it  is  now  commonly  agreed  that  program  testing  is  an  essential  part  of  the  pro¬ 
gram  development  process.  Two  testing  methodologies,  functional  and  structural,  are  described  and  the  case 
made  for  combining  both  methodologies.  Finally,  a  system  that  aids  in  structural  testing  is  described. 

[Clar83b]  Abstract:  Program  errors  can  be  considered  from  two  perspectives:  cause  and  effect.  The  goal  of  program
testing  is  to  detect  errors  by  discovering  their  effects,  while  the  goal  of  debugging  is  to  search  for  the  associ¬
ated  cause.  In  this  paper,  we  explore  ways  in  which  some  of  the  results  of  testing  research  can  be  applied  to  the 
debugging  process.  In  particular,  computational  testing  and  domain  testing,  which  are  two  error-sensitive  test 
data  selection  strategies,  are  described.  Ways  in  which  these  selection  strategies  can  be  used  as  debugging  aids 
are  then  discussed. 

[Clar84]  Abstract:  Symbolic  evaluation  is  a  program  analysis  method  that  represents  a  program’s  computations 
and  domain  by  symbolic  expressions.  This  method  has  been  the  foundation  for  much  of  the  current  research  on 
software  testing.  Most  path  selection  and  test  data  selection  techniques,  which  are  two  of  the  primary  concerns 
of  testing  research,  require  the  information  provided  by  symbolic  evaluation.  Symbolic  evaluation  is  also 
employed  by  verification  techniques.  In  addition  to  formal  verification,  several  less  rigorous  verification  tech¬ 
niques  utilize  the  symbolic  expressions  created  by  symbolic  evaluation  to  certify  program  properties. 

In  this  paper,  the  general  symbolic  evaluation  method  is  explained.  Several  path  selection  and  test  data 
selection  techniques  that  utilize  the  information  provided  by  symbolic  evaluation  are  then  described.  Some 
informal  verification  techniques,  which  also  employ  this  information,  are  discussed.  Finally,  the  partition 
analysis  method,  which  uses  symbolic  evaluation  to  combine  both  testing  and  verification,  is  described.

[Clar85a]  Abstract:  A  number  of  path  selection  testing  criteria  have  been  proposed  throughout  the  years.
Unfortunately,  little  work  has  been  done  on  comparing  these  criteria.  To  determine  what  would  be  an  effective
path  selection  criterion  for  revealing  errors  in  programs,  we  have  undertaken  an  evaluation  of  these  criteria.  This
paper  reports  on  the  results  of  our  evaluation  for  those  path  selection  criteria  based  on  data  flow  relationships. 
We  show  how  these  criteria  relate  to  each  other,  thereby  demonstrating  some  of  their  strengths  and  weaknesses. 

[Clar85b]  Abstract:  Symbolic  evaluation  is  a  program  analysis  method  that  represents  a  program’s  computations 
and  domain  by  symbolic  expressions.  In  this  paper  a  general  functional  model  of  a  program  is  presented  first. 
Then,  three  related  methods  of  symbolic  evaluation,  which  create  this  functional  description  from  a  program, 
are  described:  path-dependent  symbolic  evaluation  provides  a  representation  of  a  specified  path;  dynamic  sym¬ 
bolic  evaluation,  which  is  more  restrictive  but  less  costly  than  path-dependent  symbolic  evaluation,  is  a  data- 
dependent  method;  and  global  symbolic  evaluation,  which  is  the  most  general  yet  most  costly  method,  captures 
the  functional  behavior  of  an  entire  program  when  successful.  All  three  methods  have  been  implemented  in 
experimental  systems.  Some  of  the  major  implementation  concerns,  which  include  effectively  representing 
loops,  determining  path  feasibility,  dealing  with  compound  data  structures,  and  handling  routine  invocations,  are 
explained.  The  remainder  of  the  paper  surveys  the  range  of  applications  to  which  symbolic  evaluation  techniques 
are  being  applied.  The  current  and  potential  role  of  symbolic  evaluation  in  verification,  testing,  debugging, 
optimization,  and  software  development  is  explored.
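
As a rough illustration of the path-dependent method summarized in the [Clar85b] abstract, the sketch below walks one specified path and builds a symbolic path condition and symbolic variable values. The statement encoding, the function names, and the `x0` naming of input symbols are assumptions made for this example only; they are not taken from the cited paper, and loops, path feasibility, and compound data structures are not handled.

```python
# Illustrative sketch only: a tiny path-dependent symbolic evaluator.
# The ("assign", ...) / ("assume", ...) encoding is hypothetical.

def substitute(expr, state):
    """Replace each variable token in a space-separated expression
    by its current symbolic value."""
    return " ".join(
        f"({state[tok]})" if tok in state else tok for tok in expr.split()
    )

def evaluate_path(path, inputs):
    """Walk one path and return (path condition, final symbolic state)."""
    state = {name: f"{name}0" for name in inputs}  # inputs start as symbols
    condition = []
    for kind, *args in path:
        if kind == "assign":
            target, expr = args
            state[target] = substitute(expr, state)
        elif kind == "assume":  # branch predicate taken along this path
            condition.append(substitute(args[0], state))
    return " and ".join(condition) or "true", state

# One path through a made-up program fragment.
path = [
    ("assign", "y", "x + 1"),
    ("assume", "y > 0"),
    ("assign", "z", "y * 2"),
]
pc, final = evaluate_path(path, inputs=["x"])
print(pc)          # path condition over the input symbol x0
print(final["z"])  # symbolic value of z at the end of the path
```

The path condition describes the subset of the input domain that drives execution down this path, and the final state gives the path computation in terms of the input symbols.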

[Clar86a]  Abstract:  A  number  of  path  selection  criteria  have  been  proposed  throughout  the  years.  Unfor¬ 
tunately,  little  work  has  been  done  on  comparing  these  criteria.  To  determine  what  would  be  an  effective  path 
selection  criterion  for  revealing  errors  in  programs,  we  have  undertaken  an  evaluation  of  these  criteria.  In  our 
initial  work,  we  compared  three  families  of  data  flow  path  selection  criteria  and  found  that  the  strongest  criteria 
in  these  families  are  incomparable  and  that,  in  two  of  the  families,  the  strongest  criteria  fail  to  satisfy  certain 
minimal  coverage  requirements.  In  this  paper,  we  provide  an  overview  of  our  previous  results.  We  then  introduce 
minor  changes  to  the  original  criteria  to  assure  that  they  each  satisfy  these  minimal  coverage  requirements  and 
show  how  these  modified  criteria  relate  to  each  other.  We  conclude  with  a  discussion  on  directions  for  future 
work  in  this  area. 

[Clar86c]  Abstract:  Tools  in  a  software  development  environment  often  manipulate  objects  that  are  instances  of 
attributed  graphs.  Moreover,  an  individual  attributed-graph  instance  may  be  manipulated  by  several  different 
tools  in  an  environment.  During  the  prototyping  phase  in  the  design  of  a  software  development  environment, 
experimentation  with  tools  may  dictate  changes  to  the  high-level  structure  of  an  attributed  graph  as  well  as 
changes  to  the  graph’s  underlying  representation.  We  have  developed  a  meta-tool  called  GRAPHITE  to  facili¬ 
tate  both  kinds  of  experimentation  while  minimizing  the  impact  of  that  experimentation  on  the  tools  in  the 
environment.  This  meta-tool  and  its  potential  contributions  to  an  experimental  effort  to  build  an  advanced  Ada 
software  development  environment  are  described  in  this  paper. 

[Clar88a]  Abstract:  Current  research  indicates  that  software  reliability  needs  to  be  achieved  through  the  careful
integration  of  a  number  of  diverse  testing  and  analysis  techniques.  To  address  this  need,  the  Team  environment 
has  been  designed  to  support  the  integration  of  and  experimentation  with  an  ever  growing  number  of  software 
testing  and  analysis  tools.  To  achieve  this  flexibility,  we  exploit  three  design  principles:  component  technology  so 
that  common  underlying  functionality  is  recognized;  generic  realizations  so  that  these  common  functions  can  be 
instantiated  as  diversely  as  possible;  and  language  independence  so  that  tools  can  work  on  multiple  languages, 
even  allowing  some  tools  to  be  applicable  to  different  phases  of  the  software  lifecycle.  The  result  is  an  environ¬ 
ment  that  contains  building  blocks  for  easily  constructing  and  experimenting  with  new  testing  and  analysis  tech¬ 
niques.  Although  the  first  prototype  has  just  recently  been  implemented,  we  feel  it  demonstrates  how  modularity, 
genericity,  and  language  independence  further  extensibility  and  integration. 

[Clar88b]  Introduction:  It  is  clear  from  recent  research  that  to  achieve  highly  reliable  software  a  number  of  test¬ 
ing  techniques  will  need  to  be  effectively  automated  and  integrated  together  into  a  powerful  testing  system.  To 
achieve  this  goal  we  have  been  pursuing  two  research  directions.  One  direction  has  been  an  investigation  into 
which  testing  techniques  should  be  included  in  such  a  system.  As  part  of  this  effort  we  have  evaluated  several  dif¬ 
ferent  techniques  to  understand  their  strengths  and  weaknesses  [Clar85a,  Rich86b].  This  evaluation  has  led  to 
the  preliminary  development  of  a  model  for  integrating  testing  techniques  that  appears  quite  promising  for  tack¬ 
ling  this  difficult  task. 

The  second  research  direction  is  the  design  and  development  of  a  testing  system  that  can  support  the 
integration  of  various  testing  techniques.  Certainly  the  results  of  the  first  direction  will  have  a  major  impact  on 
the  second.  Many  of  the  basic  underlying  capabilities  of  various  testing  techniques  are  the  same,  however.  In  par¬ 
ticular,  most  rely  upon  symbolic  or  data  flow  information  such  as  could  be  gathered  by  symbolic  evaluation  or 
data  flow  analysis  tools.  Thus,  we  have  been  able  to  explore  the  design  of  a  testing  system  that  would  provide 
these  basic  testing  capabilities  as  well  as  hold  the  potential  for  supporting  the  integration  of  more  advanced  tech¬ 
niques  as  work  on  the  first  research  direction  has  progressed. 

This  document  reports  on  our  progress  to  date  on  designing  and  developing  the  testing  system.  The  next 
section  provides  a  high  level  description  of  the  system  architecture  and  gives  a  brief  overview  of  each  of  the 
major  components  and  their  interaction.  Then  each  ensuing  section  describes  one  component  of  the  system  and 
the  status  of  our  work  on  that  component.  The  appendices  contain  the  actual  design  documents  and  user  manu¬ 
als.  The  research  on  evaluating  and  integrating  advanced  testing  techniques  is  described  in  a  separate  document. 

[Coch50]  Abbreviated  Preface:  Work  on  this  book  was  started  when  [the  authors]  were  members  of  the  staff  of
Iowa  State  College.  At  that  time  requests  were  received  rather  frequently  from  research  workers.  Some  wanted 
advice  on  the  conduct  of  a  specific  experiment:  others,  who  had  decided  to  use  one  of  the  more  complex  designs 
that  have  been  discovered  in  recent  years,  asked  for  a  plan  or  layout  that  could  be  followed  during  the  experi¬ 
mental  operations.  Although  the  logical  principles  governing  the  subject  of  experimentation  are  admirably 
expounded  in  Fisher’s  book  The  Design  of  Experiments ,  these  requests  indicated  a  need  for  a  different  type  of 
book,  one  which  would  describe  in  some  detail  the  most  useful  of  the  designs  that  have  been  developed,  with 
accompanying  plans  and  an  account  of  the  experimental  situations  for  which  each  design  is  most  suitable. 

[Coch53]  Abbreviated  Preface:  This  book  was  developed  from  a  course  of  lectures  on  sample  survey  tech¬ 
niques.  The  purpose  of  the  book  is  to  present  a  reasonably  comprehensive  account  of  sampling  theory  as  it  has 
been  developed  for  use  in  sample  surveys,  with  sufficient  illustrations  to  show  how  the  theory  is  applied  in  prac¬ 
tice,  and  with  a  supply  of  exercises  to  be  worked  by  the  student. 

[Cohe77]  Summary:  This  paper  describes  a  language  for  studying  the  behaviour  of  programs,  based  upon  the
data  collected  while  these  programs  are  executed  by  a  computer.  Besides  being  a  useful  tool  in  debugging,  the 
language  is  also  valuable  in  the  experimental  evaluation  of  the  complexity  of  algorithms,  in  studying  the  inter¬ 
dependence  of  conditionals  in  a  program  and  in  determining  the  feasibility  of  transporting  programs  from  one 
machine  to  another.  The  program  one  wishes  to  analyse  is  written  in  an  Algol  60-like  language;  when  the  program 
is  executed  it  automatically  stores,  in  a  data  base,  the  information  needed  to  answer  general  questions  about 
computational  events  which  occurred  during  execution.  This  information  consists  (basically)  of  the  list  of  labels 
passed  while  the  program  is  being  executed,  and  the  current  values  of  the  variables.  Since  the  list  of  labels  is 
describable  by  regular  expressions,  these  expressions  can  also  be  used  to  identify  specific  subparts  of  the  list  and 
therefore  allow  access  to  the  values  of  the  variables.  This  constitutes  the  basis  for  the  design  of  the  inquiry 
language.  The  user’s  questions  are  automatically  answered  by  a  processor  which  inspects  the  previously
generated  data  base.  The  paper  also  presents  examples  of  the  use  of  the  language  and  describes  the  implementation
of  its  processor. 
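
The idea in the summary above, that a regular expression over the list of labels passed can select a part of the execution and so give access to the variable values recorded there, can be sketched as follows. The trace layout, the label names, and the `query` helper are hypothetical and are not the inquiry language described in the paper; Python's `re` module merely stands in for it.

```python
import re

# Hypothetical execution trace: (label passed, snapshot of variable values).
trace = [
    ("L1", {"i": 0, "s": 0}),
    ("L2", {"i": 1, "s": 1}),
    ("L2", {"i": 2, "s": 3}),
    ("L2", {"i": 3, "s": 6}),
    ("L3", {"i": 3, "s": 6}),
]

def query(trace, pattern):
    """Return the variable snapshots for each run of labels matched by pattern.

    Labels are joined into one string, each followed by a single space, so a
    pattern such as r"(?:\\bL2 )+" matches consecutive passes of label L2.
    """
    labels = "".join(label + " " for label, _ in trace)
    results = []
    for m in re.finditer(pattern, labels):
        first = labels[:m.start()].count(" ")  # index of first matched label
        count = m.group(0).count(" ")          # number of labels matched
        results.append([values for _, values in trace[first:first + count]])
    return results

# Select every consecutive pass through label L2 (for example, a loop body).
loop_runs = query(trace, r"(?:\bL2 )+")
print([snapshot["s"] for snapshot in loop_runs[0]])
```

Encoding the label list as a string keeps the example short; a real processor would more likely match over label tokens directly.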

[Cohe82]  Abstract:  Prototypes  are  built  for  a  variety  of  reasons.  This  paper  offers  an  alternative  to  the  use  of  a
prototype  as  a  means  of  testing  a  specification  (i.e.  someone  who  “knows”  what  he  wants  compares  his  intuitive 
understanding  with  the  behavior  of  the  prototype  on  particular  test  cases).  The  alternative  is  symbolic  execution 
of  a  formal  specification,  i.e.  the  specification  is  the  prototype  and  its  behavior  determined  by  symbolic  execu¬ 
tion  rather  than  the  traditional  “concrete”  execution.  This  is  an  extension  of  the  approach  to  rapid  prototyping 
based  on  operational  specification  and  an  alternative  to  testing  prototypes  whether  manually  constructed  or 
developed  mechanically  from  such  an  operational  specification.  One  advantage  of  this  approach  is  that  the  pro¬ 
totype  need  not  be  built  at  all.  Of  course,  the  formal  specification  must  be  written,  but  this  is  often  necessary 
anyway,  especially  if  the  specifier  and  implementor  are  different  people.  A  more  important  advantage  arising 
from  symbolic  execution  is  that  a  large  subset  of  the  possible  behaviors  can  be  examined  at  once. 

[Come79]  Abstract:  In  this  paper  we: 

1.  discuss  the  need  for  quantitatively  reproducible  experiments  in  the  study  of  top-down  design; 

2.  propose  the  design  and  writing  of  tutorial  papers  as  a  suitably  general  and  inexpensive  vehicle; 

3.  suggest  the  software  science  parameters  as  appropriate  metrics; 

4.  report  two  experiments  validating  the  use  of  these  metrics  on  outlines  and  prose;  and 

5.  demonstrate  that  the  experiments  tended  toward  the  same  optimal  modularity. 

The  last  point  appears  to  offer  a  quantitative  approach  to  the  estimation  of  the  total  length  or  volume  (and 
the  mental  effort  required  to  produce  it)  from  an  early  stage  of  the  top-down  design  process.  If  results  of  these 
experiments  are  validated  elsewhere,  then  they  will  provide  basic  guidelines  for  the  design  process. 

[Conn87]  Abstract:  The  Ada  Software  Repository  (ASR)  is  a  collection  of  Ada  programs,  software  com¬ 
ponents,  information  files,  and  educational  material  that  resides  on  the  computer  known  as  SIMTEL20  on  the 
Defense  Data  Network  (DDN),  a  world-wide  network  of  computer  networks  supported  by  the  US  Department
of  Defense.  This  repository  has  been  accessible  to  any  host  computer  on  the  DDN  since  November  26,  1984,  and
is  available  to  any  member  of  the  Ada  community  in  the  United  States  and  its  allies. 

The  Ada  Software  Repository  (ASR)  is  a  free  source  of  Ada  programs  and  information.  It  serves  two 
roles:  to  promote  the  exchange  and  use  of  Ada  programs  (including  reusable  software  components)  and  to  pro¬ 
mote  Ada  education  (by  providing  information  on  items  of  interest  to  the  Ada  community  and  by  providing 
examples  of  working,  useful  Ada  programs).  There  is  over  40M  bytes  of  source  code,  documentation,  and  infor¬ 
mation  files  in  the  ASR. 

[Cont86]  Abbreviated  Preface:  This  book  is  intended  to  be  used  by  both  practitioners  and  students  who  are  or 
who  expect  to  be  involved  in  managing  or  producing  software.  It  is  our  firm  belief  that  software  engineering  in 
general  and  software  metrics  in  particular  should  be  a  part  of  the  curriculum  of  all  computer  science  programs. 

Our  goal  is  that,  through  this  book,  many  readers  will  be  introduced  to  the  world  of  software  metrics  and 
models.  We  also  believe  that  this  material  will  serve  as  an  impetus  to  researchers  in  software  engineering  and 
software  developers  to  derive  new  metrics  and  models,  and  to  gather  data  to  help  confirm  the  utility  of  new  and 
existing  metrics  and  models. 

[Cook82]  Abstract:  Software  metrics,  an  area  of  software  engineering,  is  concerned  with  various  measurements
of  computer  software  and  its  development.  Software  metrics,  its  importance,  some  current  areas  of  investiga¬ 
tion,  and  problems  are  described.  An  annotated  bibliography  of  work  in  software  metrics  is  included. 

[Coul83]  Abstract:  Halstead  proposed  a  methodology  for  studying  the  process  of  programming  known  as 
software  science.  This  methodology  merges  theories  from  cognitive  psychology  with  theories  from  computer  sci¬ 
ence.  There  is  evidence  that  some  of  the  assumptions  of  software  science  incorrectly  apply  the  results  of  cogni¬ 
tive  psychology  studies.  Halstead  proposed  theories  relative  to  human  memory  models  that  appear  to  be  without 
support  from  psychologists.  Other  software  scientists,  however,  report  empirical  evidence  that  may  support 
some  of  those  theories.  This  anomaly  places  aspects  of  software  science  in  a  precarious  position.  The  three
conflicting  issues  discussed  in  this  paper  are  1)  limitations  of  short-term  memory  and  number  of  subroutine
parameters,  2)  searches  in  human  memory  and  programming  effort,  and  3)  psychological  time  and  programming  time.

[Cox81]  Abstract:  In  this  paper  [the  author]  describes  the  practical  problems  of  designing  a  regression  test  set  for
an  existing  mini-computer  operating  system.  The  ideal  regression  test  would  test  each  function  with  all  possible 
combinations  of  the  options  for  each  variation  of  the  operating  system.  This  is  impractical  if  not  impossible  so 
the  alternative  is  to  choose  the  individual  cases  for  maximum  coverage.  To  do  that  the  system  is  viewed  both 
functionally  and  structurally  and  cases  are  selected  for  inclusion  in  the  test  set.  The  method  of  selecting  the  tests 
is  described  along  with  the  tools  that  will  be  needed  to  measure  the  coverage  and  to  maintain  the  test  set. 
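
To make concrete why testing all possible combinations of options is impractical, the sketch below counts the exhaustive combinations for a small, made-up set of options and then selects a much smaller set in which every option value appears at least once. The option names and the "each value once" selection rule are assumptions for illustration; the abstract does not say that this is the selection method used in [Cox81].

```python
from itertools import product
from math import prod

# Hypothetical option sets for one operating-system function.
options = {
    "device": ["disk", "tape", "terminal"],
    "mode": ["read", "write", "append"],
    "buffering": ["none", "single", "double"],
    "priority": ["low", "normal", "high"],
}

# Exhaustive testing: one case per combination of option values.
exhaustive = list(product(*options.values()))
assert len(exhaustive) == prod(len(v) for v in options.values())

def each_value_once(options):
    """Pick cases so that every option value appears in at least one case."""
    width = max(len(values) for values in options.values())
    return [
        tuple(values[i % len(values)] for values in options.values())
        for i in range(width)
    ]

selected = each_value_once(options)
print(len(exhaustive), "exhaustive cases;", len(selected), "selected cases")
```

The exhaustive count grows as the product of the option-set sizes, while the selected set grows only with the largest option set, at the cost of not exercising interactions between options.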

[Crai88a]  Abstract:  The  Trusted  Systems  Group  of  I.P.  Sharp  Associates  Limited  has  recently  released  a  prototype
formal  verification  system,  called  m-EVES.  m-EVES  consists  of  a  new  language,  called  m-Verdi,  for  implementing
and  specifying  software;  a  new  logic  (which  has  been  proven  sound);  and  a  new  theorem  prover,  called
m-NEVER,  which  integrates  many  state-of-the-art  techniques  drawn  from  the  theorem  proving  literature. 

In  this  paper,  after  a  brief  overview  of  the  m-EVES  system,  an  application  of  m-EVES  to  a  proof  of  a  non¬ 
trivial  security  property  (non-interference)  for  a  pedagogical  computer  system  (the  Low  Water  Mark  system)  is 
discussed.  An  example  demonstrates  some  of  the  power  and  novel  features  of  m-EVES.  The  paper  concludes 
with  a  comparison  of  the  m-EVES  solution  with  similar  efforts  using  the  Gypsy  Verification  Environment  and  the 
Boyer-Moore  theorem  prover. 

[Crai88b]  Abstract:  This  paper  describes  the  development  of  a  new  tool  for  formally  verifying  software.  The
tool  is  called  m-EVES  and  consists  of  a  new  language,  called  m-Verdi,  for  implementing  and  specifying  software;
a  new  logic,  which  has  been  proven  sound;  and  a  new  theorem  prover,  called  m-NEVER,  which  integrates  many 
state-of-the-art  techniques  drawn  from  the  theorem  proving  literature.  Two  simple  examples  are  used  to  present 
the  fundamental  ideas  embodied  within  the  system.

[Curr86]  Abstract:  The  accepted  approach  to  software  development  is  to  specify  and  design  a  product  in
response  to  a  requirements  analysis  and  then  to  test  the  software  selectively  with  cases  perceived  to  be  typical  to 
those  requirements.  In  contrast  it  is  possible  to  embed  the  software  development  and  testing  process  within  a 
formal  statistical  design.  In  such  a  design,  software  testing  can  be  used  to  make  statistical  inferences  about  the 
reliability  of  the  future  operation  of  the  software.  This  paper  describes  a  procedure  for  certifying  the  reliability  of 
software  before  its  release  to  users.  The  ingredients  of  this  procedure  are  a  life  cycle  of  executable  product  incre¬ 
ments,  representative  statistical  testing,  and  a  standard  estimate  of  the  MTTF  (mean  time  to  failure)  of  the  pro¬ 
duct  at  the  time  of  its  release. 

[Curt79a]  Abstract:  This  experiment  is  the  third  in  a  series  investigating  characteristics  of  software  which  are 
related  to  its  psychological  complexity.  A  major  focus  of  this  research  has  been  to  validate  the  use  of  software 
complexity  metrics  for  predicting  programmer  performance.  In  this  experiment  we  improved  experimental  pro¬ 
cedures  which  produced  only  modest  results  in  the  previous  two  studies.  The  experimental  task  required  54 
experienced  Fortran  programmers  to  locate  a  single  bug  in  each  of  three  programs.  Performance  was  measured 
by  the  time  to  locate  and  successfully  correct  the  bug.  Much  stronger  results  were  obtained  than  in  earlier  stu¬ 
dies.  Halstead’s  E  proved  to  be  the  best  predictor  of  performance,  followed  by  McCabe’s  V(G)  and  the  number 
of  lines  of  code. 
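
For readers unfamiliar with the two metrics named above, the following sketch computes Halstead's effort E and McCabe's cyclomatic number V(G) from their standard definitions. The counts passed in are made up for illustration and have no connection to the programs used in the experiment; counting operators, operands, and control-flow edges in real source code is a separate problem not shown here.

```python
from math import log2

def halstead_effort(n1, n2, N1, N2):
    """Halstead effort E = D * V.

    n1, n2: number of distinct operators and distinct operands.
    N1, N2: total occurrences of operators and of operands.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)    # D = (n1 / 2) * (N2 / n2)
    return difficulty * volume

def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's V(G) = e - n + 2p for a control-flow graph."""
    return edges - nodes + 2 * components

# Made-up counts for a small routine.
effort = halstead_effort(n1=4, n2=4, N1=10, N2=6)
vg = cyclomatic_complexity(edges=9, nodes=8)
print(effort, vg)
```

Both functions take counts rather than source text, so the same code can be applied whichever counting conventions a study adopts.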

[Curt79b]  Abstract:  Three  software  complexity  measures  (Halstead’s  E,  McCabe’s  V(G),  and  the  length  as  measured
by  number  of  statements)  were  compared  to  programmer  performance  on  two  software  maintenance  tasks.
In  an  experiment  on  understanding,  length  and  V(G)  correlated  with  the  percent  of  statements  correctly 
recalled.  In  an  experiment  on  modification,  most  significant  correlations  were  obtained  with  metrics  computed 
on  modified  rather  than  unmodified  code.  All  three  metrics  correlated  with  both  the  accuracy  of  the  modification 
and  the  time  to  completion.  Relationships  in  both  experiments  occurred  primarily  in  unstructured  rather  than 
structured  code,  and  in  code  with  no  comments.  The  metrics  were  also  most  predictive  of  performance  for  less 
experienced  programmers.  Thus,  these  metrics  appear  to  assess  psychological  complexity  primarily  where  pro¬ 
gramming  practices  do  not  provide  assistance  in  understanding  the  code. 

[DACS79b]  Preface:  The  purpose  of  this  document  is  to  record,  as  accurately  as  is  possible  in  a  still-evolving 
discipline,  the  terminology  currently  being  used  in  the  field  of  software  engineering.  We  hope  that  the  DACS 
GLOSSARY  will  help  to  improve  communication  within  the  software  engineering  community  and  will  also  pro¬ 
vide  an  impetus  toward  the  sorely  needed  standardization  of  terminology. 

This  software  engineering  glossary  is  one  of  the  products  of  the  Data  and  Analysis  Center  for  Software 
(DACS).  The  DACS  will  continue  to  update  this  glossary  to  reflect  current  term  usage.  Suggestions,  comments, 
and  critiques  are  welcome. 

[DOD88a]  Foreword:

1.  This  standard  establishes  uniform  requirements  for  software  development  that  are  applicable  throughout 
the  system  life  cycle.  The  requirements  of  this  standard  provide  the  basis  for  Government  insight  into  a 
contractor’s  software  development,  testing,  and  evaluation  efforts. 

2.  This  standard  is  not  intended  to  specify  or  discourage  the  use  of  any  particular  software  development 
method.  The  contractor  is  responsible  for  selecting  software  development  methods  (for  example,  rapid 
prototyping)  that  best  support  the  achievement  of  contract  requirements. 

3.  This  standard,  together  with  the  other  DOD  and  military  documents  referenced  in  Section  2,  provides  the 
means  for  establishing,  evaluating,  and  maintaining  quality  in  software  and  associated  documentation. 

4.  Data  Item  Descriptions  (DIDs)  applicable  to  this  standard  are  listed  in  Section  6.  These  DIDs  describe  a 
set  of  documents  for  recording  the  information  required  by  this  standard.  Production  of  deliverable  data 
using  automated  techniques  is  encouraged. 

5.  Per  DODD  5000.43,  Acquisition  Streamlining,  this  standard  must  be  appropriately  tailored  by  the  program
manager  to  ensure  that  only  cost-effective  requirements  are  cited  in  defense  solicitations  and  contracts. 
Tailoring  guidance  can  be  found  in  DOD-HDBK-248,  Guide  for  Application  and  Tailoring  of  Requirements
for  Defense  Materiel  Acquisitions.

[DODD86a]  Abstract:  The  purpose  of  this  Manual  is  to  describe  the  format,  content,  and  submission  procedures
for  Test  and  Evaluation  Master  Plans  (TEMPs)  of  major  defense  acquisition  programs.  The  TEMP  is
the  basic  planning  document  for  all  test  and  evaluation  (T&E)  related  to  a  particular  system  acquisition  and  is 
used  by  OSD  and  all  DoD  Components  in  planning,  reviewing,  and  approving  T&E.  The  TEMP  provides  the 
basis  for  all  other  detailed  T&E  planning  documents  and  serves  as  an  essential  element  of  the  Joint  Resources 
Management  Board  (JRMB)  decision-making  process  outlined  in  DoD  Instruction  5000.2,  “Major  System 
Acquisition  Procedures.”

[DODD86b]  Purpose:  This  Manual  describes  the  procedures  to  be  followed  in  preparing  Test  and  Evaluation 
Master  Plans  (TEMPs)  for  major  defense  acquisition  programs,  including  those  major  (to  include  OSD  desig¬ 
nated)  programs  that  are  exempted  from  the  Joint  Resources  Management  Board  (JRMB)  process  or  when 
JRMB  authority  has  been  delegated  to  the  DoD  Components. 

[DODD87]  Overview:  This  Manual  describes  procedures  for  preparing  the  software  portion  of  the  Test  and 
Evaluation  Master  Plan  (TEMP).  The  Manual’s  objective  is  to  establish  a  disciplined  framework  that  will  result 
in  software  testing  that  is  methodically  planned,  results  oriented,  and  designed  to  produce  meaningful  evalua¬ 
tions. 

[DODS86]  Abbreviated  Foreword:  This  standard  contains  requirements  for  the  development,  documentation,
and  implementation  of  a  software  quality  program.  This  program  includes  planning  for  and  conducting  evaluations
of  the  quality  of  software,  associated  documentation,  and  related  activities,  and  planning  for  and  conducting
the  follow-up  activities  necessary  to  assure  timely  and  effective  resolution  of  problems.

This  standard,  together  with  other  DOD  and  military  specifications  and  standards  governing  software 
development,  configuration  management,  specification  practices,  project  reviews  and  audits,  and  subcontractor 
management,  provides  a  means  for  achieving,  determining,  and  maintaining  quality  in  software  and  associated
documentation.  This  standard  incorporates  the  applicable  requirements  of  MIL-STD-1520  and  MIL-STD-1535. 

This  standard  implements  the  policies  of  DODD  4155.1,  Quality  Program,  and  provides  all  of  the  neces¬ 
sary  elements  of  a  comprehensive  quality  program  applicable  to  software  development  and  support.  This  stan¬ 
dard  interprets  the  requirements  of  MIL-Q-9858,  Quality  Program  Requirements,  for  software  and  is  to  be  used 
in  conjunction  with  MIL-Q-9858  for  system  development  and  support  projects. 

[Dahl72]  Table  of  Contents:  Notes  on  structured  programming,  correctness  of  proofs,  validity  of  proofs.  Notes
on  data  structuring,  the  concept  of  type,  unstructured  data  types,  recursive  data  structures,  axiomatisation, 
references.  Hierarchical  program  structures,  object  classes,  coroutines,  list  structures,  program  concatenation, 
concept  hierarchies,  references. 

[Daly77]  Abstract:  This  paper  describes  four  major  aspects  of  software  management:  development  statistics, 
development  process,  development  objectives,  and  software  maintenance.  The  control  of  both  large  and  small 
software  projects  is  included  in  the  analysis. 

[Darr78]  Abstract:  Symbolic  execution  provides  a  basis  for  a  program  analysis  tool  that  allows  one  to  choose 
intermediate  points  in  a  spectrum  ranging  between  individual  test  runs  and  general  correctness  proofs.  One  can 
perform  a  single  “symbolic  execution”  of  a  program  that  is  equivalent  to  a  large  (possibly  unbounded)  number  of 
normal  test  runs.  Not  only  can  test  results  be  checked  by  careful  manual  inspection,  but  if  a  machine  interpret¬ 
able  specification  is  supplied,  the  results  can  be  checked  automatically.  Furthermore,  by  varying  the  amount  of 
symbolic  data  and  program  specification  introduced,  one  can  move  from  a  normal  execution  (no  symbolic  data) 
to  a  symbolic  execution  that  provides  a  proof  of  correctness.

[Davi77]  Abstract:  The  paper  considers  informally  the  relationship  between  computer  aided  mathematical
proof,  formal  algebraic  languages,  computation  with  transcendental  numbers,  and  proof  by  sampling. 

[Davi82b]  Abbreviated  Introduction:  There  are  at  least  four  phases  in  the  development  of  “correct”  software:

—  Understanding  the  problem.  The  program  designer  may  work  with  intended  users  of  the  system  to  develop  an 
intuitive  understanding  of  the  problem  and  possible  approaches  to  its  solution. 

—  Formal  specification.  Once  the  designer  knows  intuitively  how  to  solve  the  problem,  the  solution  must  be 
specified  unambiguously. 

—  Programming.  An  implementation  of  the  specification  is  programmed. 

—  Verification.  The  implementation  developed  in  step  three  is  shown  to  satisfy  the  specification  of  step  two.

There  is  a  certain  amount  of  testing  and  debugging  that  goes  on  at  each  of  these  stages  until  one  is  satisfied  with
the  current  step  and  moves  on  to  the  next.  Several  verification  techniques  have  been  developed  to  assist  in 
accomplishing  step  four.  However,  even  after  a  proof  is  completed  we  cannot  claim  to  have  a  “correct”  program, 
only  one  that  satisfies  the  given  specification. 

How  does  one  “debug”  a  specification?  We  cannot  hope  to  formally  prove  that  a  specification  is  “correct” 
with  respect  to  our  intuition,  but  we  can  at  least  test  it  to  see  that  it  conforms  to  our  intuition  in  specific  cases. 

[Davi83a]  Preface:  Theoretical  computer  science  is  the  mathematical  study  of  models  of  computation.  As  such, 
it  originated  in  the  1930s,  well  before  the  existence  of  modern  computers,  in  the  work  of  the  logicians  Church, 
Gödel,  Kleene,  Post,  and  Turing.  This  early  work  has  had  a  profound  influence  on  the  practical  and  theoretical
development  of  computer  science.  Not  only  has  the  Turing-machine  model  proved  basic  for  theory,  but  the  work 
of  these  pioneers  presaged  many  aspects  of  computational  practice  that  are  now  commonplace  and  whose  intel¬ 
lectual  antecedents  are  typically  unknown  to  users.  Included  among  these  are  the  existence  in  principle  of  all¬ 
purpose  (or  universal)  digital  computers,  the  concept  of  a  program  as  a  list  of  instructions  in  a  formal  language, 
the  possibility  of  interpretive  programs,  the  duality  between  software  and  hardware,  and  the  representation  of 
languages  by  formal  structures  based  on  productions.  While  the  spotlight  in  computer  science  has  tended  to  fall 
on  the  truly  breathtaking  technological  advances  that  have  been  taking  place,  important  work  in  the  foundations 
of  the  subject  has  continued  as  well.  It  is  our  purpose  in  writing  this  book  to  provide  an  introduction  to  the  vari¬ 
ous  aspects  of  theoretical  computer  science  for  undergraduate  and  graduate  students  that  is  sufficiently 
comprehensive  that  the  professional  literature  of  treatises  and  research  papers  will  become  accessible  to  our 
readers. 

We  are  dealing  with  a  very  young  field  that  is  still  finding  itself.  Computer  scientists  have  by  no  means 
been  unanimous  in  judging  which  parts  of  the  subject  will  turn  out  to  have  enduring  significance.  In  this  situation, 
fraught  with  peril  for  authors,  we  have  attempted  to  select  topics  that  have  already  achieved  a  polished  classic 
form,  and  that  we  believe  will  play  an  important  role  in  future  research. 

[Davi83b]  Introduction:  We  propose  a  definition  of  the  notion  of  adequacy  of  software  test  data  and  discuss  justification,
difficulties,  and  properties  of  the  notion.  It  is  not  the  purpose  of  this  paper  to  suggest  a  definite  practically
applicable  criterion  of  test  data  adequacy.  Rather  we  present  a  theoretical  analysis  which,  it  is  believed,
gives  insight  into  such  questions  as: 

1.  For  a  given  program,  what  points  must  belong  to  a  test  set  in  order  that  it  may  be  deemed  adequate?

2.  For  a  given  program  how  many  points  must  belong  to  an  adequate  test  set? 

3.  What  kind  of  approximation  to  “correctness”  can  be  provided  by  the  knowledge  that  a  program  has  been 
“adequately”  tested? 

We  believe,  in  general,  that  an  adequacy  criterion  should  be  invoked  only  after  the  test  data  fails  to  expose 
errors.  Clearly,  as  long  as  there  is  an  element  of  the  test  set  on  which  the  program  does  not  agree  with  the  specifi¬ 
cation,  we  know  that  the  test  data  is  still  doing  its  job  and  that  testing  (and  subsequent  debugging)  must  continue. 
Once  the  program  does  agree  with  the  specification  on  all  elements  of  a  set  of  test  data,  we  must  decide  whether 
the  testing  phase  can  end,  and  hence  we  will  need  to  invoke  some  kind  of  adequacy  criterion. 

[Davi88a] Abstract: A study of the predictive value of a variety of syntax-based program complexity measures is
described.  Experimentation  with  variants  of  new  chunk-oriented  measures  showed  that  one  should  judiciously 
select  measurable  software  attributes  as  proper  indicators  of  what  one  wishes  to  predict,  rather  than  hoping  for  a 
single, all purpose complexity measure. This study has shown that it is possible for particular complexity measures
or other factors to serve as good predictors of some properties of programs but not of others. For example,
a  good  predictor  of  construction  time  will  not  necessarily  correlate  well  with  the  number  of  error  occurrences. 

Halstead’s  effort  measure  (E)  was  found  to  be  a  better  predictor  than  the  other  two  nonchunk  measures 
we  evaluated:  McCabe’s  V(G)  and  lines  of  code,  but  at  least  one  chunk  measure  predicted  better  than  E  in  every 
case. 
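
As a point of reference, the two nonchunk measures named above can be computed from simple counts. The sketch below uses the commonly cited textbook definitions rather than anything taken from the paper itself: Halstead volume V = N log2(n), difficulty D = (n1/2)(N2/n2), effort E = D * V, and McCabe's V(G) = e - n + 2p for a control-flow graph. The function and parameter names are illustrative assumptions.

```python
import math

def halstead_effort(n1, n2, N1, N2):
    """Halstead effort E = D * V from operator/operand counts.

    n1, n2: number of distinct operators / distinct operands
    N1, N2: total occurrences of operators / operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2.0) * (N2 / n2)       # D = (n1 / 2) * (N2 / n2)
    return difficulty * volume

def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's V(G) = e - n + 2p for a control-flow graph."""
    return edges - nodes + 2 * components
```

The chunk-oriented measures studied in the paper are not reproduced here, since the abstract does not define them.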

[DeFr85]  Abstract:  This  work  deals  with  issues  of  interactive  debugging  for  the  concurrent  language  ECSP.  The 
debugger  matches  a  formal  specification  of  the  expected  behavior.  This  specification  can  be  given  at  different  lev¬ 
els  of  abstraction.  Control  is  returned  to  the  user  when  an  error  is  detected.  The  user  can  then  modify  the  flow  of 
the  computation  and/or  dynamically  change  the  specification  of  the  expected  behavior.  The  debugger  implemen¬ 
tation  is  based  on  program  transformation  techniques. 

[DeMi77] Abbreviated Introduction: Until very recently, research in software reliability was divided quite neatly
into  two  -  usually  warring  -  camps:  methodologies  with  a  mathematical  basis  and  methodologies  without  such  a 
basis. In the former view, “reliability” is identified with “correctness” and the principal tool has been formal and
informal  verification.  In  the  latter  view,  “reliability”  is  taken  to  mean  the  ability  to  meet  overall  functional  goals 
to  within  some  predefined  limits.  We  have  argued  that  the  latter  view  holds  a  great  deal  of  promise  for  further 
development  at  both  the  practical  and  analytical  levels.  Howden  proposes  a  first  step  in  this  direction  by  describ¬ 
ing  a  method  for  “testing”  a  certain  restricted  class  of  programs  whose  behavior  can  -  in  a  sense  Howden  makes 
precise  -  be  algebraicized.  In  this  way,  “testing”  a  program  is  reduced  to  an  equivalence  test,  the  major  com¬ 
ponents  of  which  become 

(i) a combinatorial identification of “equivalent” structures;

(ii) an algebraic test

f_1 = f_2,

where f_i, i = 1, 2, is a multivariable polynomial (multinomial) of degree specified by the program being considered.

We are inspired by Rabin and, less directly, by the many successes of Erdos and Spencer to attempt a probabilistic
solution to (ii). Using these methods, we show that (ii) can be tested with probability of error e with
only O(g(e)) evaluations of multinomials, where g is a slowly growing function of only e. In particular, 30 or so
evaluations should give sufficiently small probability of error for most practical situations.
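
The idea of testing an algebraic identity by random evaluation can be illustrated with a minimal sketch. It assumes the two polynomials are available as Python callables over the integers; the default of 30 trials echoes the figure quoted in the abstract, while the sampling range and all names are assumptions made for illustration. It does not compute the error bound derived in the paper.

```python
import random

def probably_equal(f1, f2, num_vars, trials=30, low=-10**6, high=10**6, seed=None):
    """Return False as soon as a random point separates f1 and f2.

    A True result means no difference was found in `trials` evaluations,
    so the polynomials are equal with high probability, not with certainty.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randint(low, high) for _ in range(num_vars)]
        if f1(*point) != f2(*point):
            return False
    return True

# Two ways of writing the same polynomial, and one that differs from both.
p = lambda x, y: (x + y) ** 2
q = lambda x, y: x * x + 2 * x * y + y * y
r = lambda x, y: x * x + y * y
```

Here `probably_equal(p, q, 2)` finds no separating point, whereas `p` and `r` differ at any point where both coordinates are nonzero, so a random point almost always exposes the difference on the first trial.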

[DeMi78] Abstract: In many cases tests of a program that uncover simple errors are also effective for uncovering
much  more  complex  errors.  This  so  called  coupling  effect  can  be  used  to  save  work  during  the  testing  process. 

[DeMi79a] Abstract: It is argued that formal verifications of programs, no matter how obtained, will not play the
same  key  role  in  the  development  of  computer  science  and  software  engineering  as  proofs  do  in  mathematics. 
Furthermore  the  absence  of  continuity,  the  inevitability  of  change,  and  the  complexity  of  specification  of  signifi¬ 
cantly  many  real  programs  make  the  formal  verification  process  difficult  to  justify  and  manage.  It  is  felt  that  ease 
of  formal  verification  should  not  dominate  program  language  design. 

[DeMi87a] Abbreviated Preface: This book is an updated and edited version of the report of the Software Test
and  Evaluation  Project  of  the  Secretary  of  Defense  (Research  and  Engineering).  The  primary  objective  of  STEP 
was  (and  remains)  the  development  of  improved  policy  and  guidance  for  the  use  by  the  U.S.  Department  of 
Defense  for  the  test  and  evaluation  of  computer  software  for  so-called  “mission-critical”  applications. 

[The book provides] state-of-the-art and state-of-the-practice overviews. These overviews contain brief
descriptions of major test methodologies, catalogs of automated tools to support them, essentially exhaustive
bibliographies,  case  studies  of  good  and  bad  examples  of  software  testing  and  exegeses  of  major  standards. 

[DeMi87b] Abbreviated Abstract: The Mothra environment is an integrated set of tools and interfaces that support
the planning, definition, preparation, execution, analysis and evaluation of tests of software systems. The
support  provided  by  Mothra  is  applicable  from  the  earliest  stages  of  software  design  and  development  through 
the  progressively  later  stages  of  system  integration,  acceptance  testing,  operation  and  maintenance.  Mothra  has 
been  designed  to  address  [various]  cost  concerns.  Two  primary  design  criteria,  in  particular,  are  significant  in  this 
regard. First, the Mothra interfaces - particularly user interfaces - are high-bandwidth. This allows us to present
more  information  during  testing  and  retesting.  Coupled  with  proper  design  and  integration  with  familiar  displays, 
it  should  obviate  the  need  for  extensive  training  to  use  Mothra. 

Secondly,  the  overall  Mothra  architecture  imposes  no  a  priori  constraints  on  the  size  of  the  software  sys¬ 
tems  that  can  be  tested  in  the  environment.  The  practical  meaning  of  this  criterion  is  that  the  same  architecture  is 
able to service programs varying in size from individual modules of less than 10^2 source lines to fully integrated
systems of more than 10^5 lines. The human user - the tester - is able to apply comparable functions across a familiar
interface  as  the  software  being  tested  evolves  in  size  and  complexity  by  several  orders  of  magnitude.  In  fact,  the 
only  indicators  of  size  or  complexity  that  have  ties  to  the  Mothra  architecture  are  the  operating  system  cost 
penalties  and  performance  delays  inherent  in  manipulating  massive  objects.  All  other  costs  and  resource 
demands  are  under  the  direct  control  of  the  tester. 

An  important  mechanism  for  meeting  these  criteria  is  that  Mothra  is  reconfigurable,  allowing  the  integra¬ 
tion  of  user  and  system  tools  with  which  the  tester  may  already  be  familiar,  and  allowing  the  system  to  make  use 
of  different  underlying  hardware  architectures  or  different  capabilities.  We  address  this  in  Mothra  by  the  use  of 
thematic tools for software testing. For example, programmers in modern development environments interact
increasingly with an array of very powerful source language debuggers. Even though formal testing methodologies
and  debugging  are  very  different  activities,  the  debugging  theme  can  be  used  as  a  metaphor  to  carry  the  tester 
from  tool  to  tool  as  the  software  being  tested  evolves. 

One Mothra system has been constructed using the AT&T Bell Labs built interactive bitmap display terminal
running under the control of a UNIX window manager called Layers. The host environment is a modestly
configured VAX 11/780 running UNIX 4.3 BSD. Another version has been implemented on VAX stations running
Ultrix 1.2 and the X Window System. However, the architecture of Mothra encourages rehosting. Furthermore,
explicit operations allow Mothra processes to spawn parallel and vectorized processes for execution by a
Cyber 205 (or any other powerful parallel machine).

[DeMi87c] Abstract: This paper presents a new technique for automatically generating test data. The method is
based  on  mutation  analysis  and  uses  constraints  to  specify  test  cases  designed  to  find  particular  types  of  errors.  A 
prototype  implementation  has  been  used  to  effectively  kill  mutants  in  a  mutation  system.  The  technique  also 
combines  the  capabilities  of  previous  test  data  generation  methods.  The  paper  includes  an  initial  set  of  con¬ 
straints  and  discusses  some  of  the  problems  that  must  be  solved  in  order  to  develop  a  complete  implementation 
of  the  technique. 
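
A toy illustration of using a constraint to find mutant-killing test data is sketched below. It is an assumption-laden simplification, not the prototype described in the abstract: a single relational-operator mutant is considered, the constraint states that the mutated predicate must evaluate differently from the original, and exhaustive search over a small integer domain stands in for real constraint solving. All names are hypothetical.

```python
from itertools import product

def original(a, b):
    return "less" if a < b else "not less"

def mutant(a, b):
    # Relational-operator mutation: '<' replaced by '<='.
    return "less" if a <= b else "not less"

def necessity(a, b):
    """Constraint: the mutated predicate must differ from the original one."""
    return (a < b) != (a <= b)

def generate_killing_test(domain):
    """Search a small finite domain for an input satisfying the constraint."""
    for a, b in product(domain, repeat=2):
        if necessity(a, b):
            return a, b
    return None

test = generate_killing_test(range(-2, 3))
```

For this mutant the constraint reduces to a == b, so any input with equal arguments makes the original and the mutant produce different outputs and therefore kills the mutant.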

[DeMi87d]  Abstract:  Mothra  is  a  software  testing  environment  that  supports  mutation-based  testing  of  software 
systems.  Mothra  is  interactive;  it  provides  a  high-bandwidth  user  interface  to  make  software  testing  faster  and 
less  painful.  Mothra  currently  runs  on  a  variety  of  systems  under  4.3  BSD  UNIX,  UNIX  System  V,  and 
ULTRIX-32  1.2.  This  paper  begins  with  a  brief  introduction  to  mutation  analysis.  We  then  take  the  reader  on  a 
guided  tour  of  Mothra,  emphasizing  how  it  interacts  with  the  tester.  We  conclude  with  a  short  discussion  of 
Mothra’s internal design.

[DeMi88a]  Abstract:  Mothra  is  a  software  testing  environment  that  supports  mutation-based  testing  of  software 
systems.  Mothra  is  interactive;  it  provides  a  high-bandwidth  user  interface  to  make  software  testing  faster  and 
less  painful.  Mothra  currently  runs  on  a  variety  of  systems  under  4.3  BSD  UNIX,  UNIX  System  V,  and 
ULTRIX-32  1.2.  This  paper  begins  with  a  brief  introduction  to  mutation  analysis.  We  then  take  the  reader  on  a 
guided tour of Mothra, emphasizing how it interacts with the tester. We then present a short discussion of
Mothra’s internal design. Next, we discuss some major problems with using mutation analysis and discuss possible
solutions. We conclude by presenting a solution to one of these problems - a new method of automatically
generating mutation-adequate test data.

[DeMi88b]  Abstract:  The  purpose  of  this  IDA  Paper  is  to  document  the  results  of  an  analysis  of  software  testing 
and  verification  technology  conducted  for  the  Ada  Joint  Program  Office  (AJPO)  and  the  Rome  Air  Develop¬ 
ment  Center  (RADC)  by  the  Institute  for  Defense  Analyses  (IDA).  The  Paper  presents  a  coordinated  strategy 
for  meeting  a  critical  technology  goal  of  the  U.S.  Department  of  Defense  -  the  development  of  computer 
software  for  these  systems  upon  which  the  Armed  Forces  can  rely  for  the  success  of  missions  with  extreme  and 
often  life  critical  requirements. 

[DeRe76] Abstract: We distinguish the activity of writing large programs from that of writing small ones. By
large  programs  we  mean  systems  consisting  of  many  small  programs  (modules),  usually  written  by  different  peo¬ 
ple. 

We need languages for programming-in-the-small, i.e., languages not unlike the common programming
languages of today, for writing modules. We also need a “module interconnection language” for knitting those
modules together into an integrated whole and for providing an overview that formally records the intent of the
programmer(s) and that can be checked for consistency by a compiler.

[Dela88]  Abstract:  Metrics  are  the  quantification  of  environmental  and  performance  factors  to  measure  the 
effectiveness  of  activities  in  the  areas  of  resources,  schedule,  quality,  and  risk.  Metrics  provide  both  a  prospec¬ 
tive  and  retrospective  measure  of  accomplishment.  Retrospective  data  provides  a  baseline  for  the  next  project. 
Prospective  data  support  forecasting,  planning,  and  control  of  on-going  activities.  The  latter  is  obviously  prefer¬ 
able. 

This  paper  summarizes  the  types  of  metrics  developed  during  the  foundation  phase  of  the  Army 
WWMCCS  Information  System  (AWIS),  and  the  methodology  applied  to  achieve  a  selected  subset  of  these 
metrics  during  full  scale  development,  which  starts  early  in  1988  and  is  expected  to  last  for  five  years. 

[Denn78]  Abstract:  Queueing  network  models  have  proved  to  be  cost  effective  tools  for  analyzing  modern  com¬ 
puter  systems.  This  tutorial  paper  presents  the  basic  results  using  the  operational  approach,  a  framework  which 
allows  the  analyst  to  test  whether  each  assumption  is  met  in  a  given  system.  The  early  sections  describe  the 
nature  of  queueing  network  models  and  their  applications  for  calculating  and  predicting  performance  quantities. 
The  basic  performance  quantities  -  such  as  utilizations,  mean  queue  lengths,  and  mean  response  times  -  are 
defined,  and  operational  relationships  among  them  are  derived.  Following  this,  the  concept  of  job  flow  balance  is 
introduced and used to study asymptotic throughputs and response times. The concepts of state transition balance,
one-step behavior, and homogeneity are then used to relate the proportions of time that each system state
is  occupied  to  the  parameters  of  job  demand  and  to  device  characteristics.  Efficient  methods  for  computing  basic 
performance  quantities  are  also  described.  Finally  the  concept  of  decomposition  is  used  to  simplify  analyses  by 
replacing  subsystems  with  equivalent  devices.  All  concepts  are  illustrated  liberally  with  examples. 
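
The basic operational quantities mentioned above can be computed directly from measurements taken over a finite observation period. The sketch below uses the standard operational definitions (utilization, mean service time, throughput, mean queue length, mean response time); the function name, argument names, and sample numbers are assumptions chosen for illustration.

```python
def operational_quantities(T, B, C, W):
    """Basic operational quantities for one device over an observation period.

    T: length of the observation period
    B: total time the device was busy
    C: number of job completions
    W: accumulated job-time (sum over jobs of time spent at the device)
    """
    U = B / T   # utilization
    S = B / C   # mean service time per completion
    X = C / T   # throughput (output rate)
    n = W / T   # mean queue length
    R = W / C   # mean response time per completion
    return {"U": U, "S": S, "X": X, "n": n, "R": R}

q = operational_quantities(T=20.0, B=16.0, C=10, W=40.0)
```

Two operational relationships follow algebraically from these definitions and can be checked on any measured data: the utilization law U = X * S and Little's law n = X * R.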

[DiMa85]  Abstract:  This  symbolic  run-time  debugger  for  Ada  provides  facilities  for  observing  and  manipulating 
the  execution  of  a  monitored  program,  also  for  concurrent  aspects.  The  debugger  can  be  used  interactively,  and 
also  as  a  monitoring  program  to  control  the  application.  A  feature  of  this  project  is  the  use  of  relational  algebra 
for  defining  compiler  and  kernel  interfaces  and  for  handling  debugger  information.  The  implementation  is  based 
on an Ada task to interface with the debugging operator and a set of user-defined Ada monitoring tasks. A prototype
of the debugger was completed as a part of ART, a relational translator and interpreter for Ada.

[Dijk76a] Table of Contents: Executional abstractions. The role of programming languages. States and their
characterization.  The  characterization  of  semantics.  The  semantic  characterization  of  a  programming  language. 
Two theorems. On the design of properly terminating constructs. Euclid’s algorithm revisited. The formal treatment
of some small examples. On nondeterminacy being bounded. An essay on the notion: “the scope of variables.”
Array variables. The linear search theorem. The problem of the next permutation. The problem of the
Dutch  national  flag.  Updating  a  sequential  file.  Merging  problems  revisited.  An  exercise  attributed  to  R.W. 
Hamming.  The  pattern  matching  problem.  Writing  a  number  as  the  sum  of  two  squares.  The  problem  of  the 
smallest  prime  factor  of  a  large  number.  The  problem  of  the  most  isolated  villages.  The  problem  of  the  shortest 
subspanning tree. Rem’s algorithm for the recording of equivalence classes. The problem of the convex hull in three
dimensions.  Finding  the  maximal  strong  components  in  a  directed  graph.  On  manuals  and  implementations.  In 
retrospect. 

[Dijk76b]  Abbreviated  Introduction:  Reviewing  recent  experiences  gained  during  the  design  and  construction  of 
a multiprogramming system [the author] finds [himself] torn between two apparently conflicting conclusions.
Confining  [himself]  to  the  difficulties  more  or  less  mastered  [the  author]  feels  that  such  a  job  is  (or  at  least  should 
be) rather easy; turning [the author’s] attention to the remaining problems such a job strikes [the author] as cruelly
difficult. The difficulties that have been overcome reasonably well are related to the reliability and the producibility
of the system, the unsolved problems are related to the sequencing of the decisions in the design process
itself. 

[The  author]  shall  mainly  describe  where  we  feel  that  we  have  been  successful.  This  choice  has  not  been 
motivated by reasons of advertisement for one’s own achievements; it is more that a good knowledge of what - and
what little! - we can do successfully, seems a safe starting point for further efforts, safer at least than starting with a
long  list  of  requirements  without  a  careful  analysis  whether  these  requirements  are  compatible  with  each  other. 

[Dill88a]  Abstract:  There  have  been  several  efforts  to  use  symbolic  execution  to  test  and  analyze  concurrent  pro¬ 
grams.  Recently  proof  systems  have  also  emerged  for  concurrent  programs  and  for  the  Ada  language  in  particu¬ 
lar.  This  paper  reports  on  an  experience  with  developing  two  different  approaches,  which  use  symbolic  execu¬ 
tion,  to  prove  partial  correctness  and  general  safety  properties  of  Ada  programs.  One  approach  is  based  upon 
interleaving  the  task  components  while  the  other  is  based  upon  verifying  the  tasks  in  isolation  and  then  perform¬ 
ing  cooperation  proofs.  Both  approaches  extend  past  efforts  by  incorporating  tasking  proof  rules  into  the  sym¬ 
bolic  executor  allowing  Ada  programs  with  tasking  to  be  formally  verified. 

The  limitations  of  each  approach  are  presented,  along  with  each  approach’s  advantages  and  disadvan¬ 
tages.  In  particular,  the  difficulty  of  dealing  with  communication  statements  in  a  loop  structure  are  addressed  in 
detail. 

[Dill88b] Abstract: Symbolic execution has been used successfully with sequential programs for generating the
verification  conditions  required  for  correctness  proofs.  This  paper  shows  how  the  symbolic  execution  model  for 
sequential  programs  can  be  extended  to  a  tasking  subset  of  Ada.  The  criteria  for  correct  operation  of  a  con¬ 
current  program  include  safety  properties,  such  as  mutual  exclusion  and  freedom  from  deadlock.  The  extended 
model,  therefore,  provides  a  basis  for  the  automatic  generation  of  verification  conditions  for  proving  general 
safety  properties  of  Ada  tasking  programs. 

[Dill88c] Abstract: An approach to the design of concurrent software systems based on the constrained expression
formalism is described. This formalism provides a rigorous conceptual model for the semantics of concurrent
computations, thereby supporting analysis of important system properties as part of the design process.
This approach allows designers to use standard specification and design languages, rather than forcing them to
deal with the formal model explicitly or directly. As a result, the approach attains the benefits of formal rigor
without the associated pain of unnatural concepts or notations for its users.
The  conceptual  model  of  concurrency  underlying  the  constrained  expression  formalism  treats  the  collection  of 
possible behaviors of a concurrent system as a set of sequences of events. The constrained expression formalism
provides  a  useful  closed-form  description  of  these  sequences.  Algorithms  were  developed  for  translating  designs 
expressed  in  a  wide  variety  of  notations  into  these  constrained  expression  descriptions.  A  number  of  powerful 
analysis  techniques  that  can  be  applied  to  these  descriptions  have  also  been  developed. 

[Doer85]  Abstract:  This  paper  describes  research  conducted  by  the  Software  Engineering  Laboratory  (SEL)  on 
the  use  of  dynamic  variables  as  a  tool  to  monitor  software  development.  The  intent  of  the  project  is  to  identify 
project  independent  measures  which  may  be  used  in  a  management  tool  for  monitoring  software  development. 
This  study  examines  several  Fortran  projects  with  similar  profiles.  The  staff  was  experienced  in  developing  these 
types  of  projects.  The  projects  developed  serve  similar  functions.  Because  these  projects  are  similar  we  believe 
some  underlying  relationships  exist  that  are  invariant  between  the  projects.  These  relationships,  once  well 
defined,  may  be  used  to  compare  the  development  of  different  projects  to  determine  whether  they  are  evolving 
the  same  way  previous  projects  in  this  environment  evolved. 

[Down85a]  Abstract:  In  this  paper,  an  approach  to  the  modeling  of  software  testing  is  described.  A  major  aim 
of  this  approach  is  to  allow  the  assessment  of  the  effects  of  different  testing  (and  debugging)  strategies  in  dif¬ 
ferent  situations.  It  is  shown  how  the  techniques  developed  can  be  used  to  estimate,  prior  to  the  commencement 
of  testing,  the  optimum  allocation  of  test  effort  for  software  which  is  to  be  nonuniformly  executed  in  its  opera¬ 
tional  phase.  In  addition,  the  question  of  application  of  statistical  models  in  cases  where  the  data  environment 
undergoes  changes  is  discussed.  Finally,  two  models  are  presented  for  the  assessment  of  the  effects  of  imperfec¬ 
tions  in  the  debugging  process. 

[Down86] Abstract: This paper shows how a major (and questionable) assumption underlying a previously
reported  approach  to  the  modeling  of  software  testing  can  be  relaxed  in  order  to  provide  a  more  realistic  model. 
Under  the  assumption  of  uniform  execution  the  new  model  is  found  to  perform  only  marginally  better  than  the 
previous  model,  indicating  that  the  uniform  execution  assumption  is  a  poor  one.  A  nonuniform  execution  model, 
also  developed  in  the  paper,  is  then  shown  to  give  very  good  performance  on  application  to  three  sets  of  software 
reliability  data.  The  results  obtained  point  the  way  to  further  developments  which  are  likely  to  lead  to  models 
whose  performance  is  superior  to  that  of  the  nonuniform  execution  model  presented  here.  The  paper  also 
devotes some attention to the problem of comparison of performance of different models and points out some
difficulties  in  this  area. 

[Drap66]  Abbreviated  Preface:  We  have  tried  to  bring  together  in  this  book  a  number  of  procedures  developed 
for regression problems in current use. Since our emphasis is on practical application, we have stated theoretical
results  without  proofs  in  many  cases.  While  the  text  can  be  used  without  any  computing  equipment  at  all  (or 
perhaps  with  only  a  desk  calculator),  we  have  made  use  of  computer  printouts  in  some  parts  of  the  book.  We 
have also provided various exercises, some of which can be solved easily “by hand,” and other more extensive
ones  for  which  use  of  an  electronic  computer  would  be  helpful,  though  not  absolutely  essential. 

This  book  provides  a  standard,  basic  course  in  multiple  linear  regression,  but  it  also  includes  material  that 
either  has  not  previously  appeared  in  a  textbook  or,  if  it  has  appeared,  is  not  generally  available.  For  example, 
Chapter  3  discusses  the  examination  of  residuals;  Chapter  6  examines  the  methods  employed  as  selection  pro¬ 
cedures  in  various  types  of  regression  programs;  Chapter  8  discusses  the  planning  of  large  regression  studies;  and 
Chapter  10  provides  a  basic  introduction  to  the  theory  of  nonlinear  estimation. 

[Duke89]  Abbreviated  Introduction:  Verifying  and  validating  flight  and  mission-critical  systems  is  a  major 
activity  at  the  Dryden  Flight  Research  Facility  of  the  National  Aeronautics  and  Space  Administration’s  Ames 
Research  Center.  The  Ames-Dryden  staff  is  responsible  for  flight  safety  for  all  vehicles  flown  at  the  Dryden  facil¬ 
ity,  which  is  located  in  the  desert  north  of  Los  Angeles.  Because  these  systems  are  used  in  research  aircraft,  the 
V&V  experience  at  Ames-Dryden  is  primarily  with  one-of-a-kind  research  systems  on  experimental  vehicles. 

The  Ames-Dryden  V&V  methodology  relies  on  testing,  peer  review,  abstract  models,  simulations,  and 
validation  by  actual  flight.  This  methodology  also  relies,  in  a  large  part,  on  engineering  judgement  and  a  tradition 
that has evolved from experience with flight-critical systems that include the digital flight-control system on the
F-8, the three-eighths-scale remotely piloted F-15, the highly maneuverable HiMAT, the Advanced-Fighter Technology
Integration F-16, and the X-29 forward-swept-wing aircraft.

[Dunc81] Abstract: We present a method for generating test cases that can be used throughout the entire life
cycle  of  a  program.  This  method  uses  attributed  translation  grammars  to  generate  both  inputs  and  outputs,  which 
can  then  be  used  either  as  is,  in  order  to  test  the  specifications,  or  in  conjunction  with  automatic  test  drivers  to 
test  an  implementation  against  the  specifications. 

The  grammar  can  generate  test  cases  either  randomly  or  systematically.  The  attributes  are  used  to  guide 
the  generation  process,  thereby  avoiding  the  generation  of  many  superfluous  test  cases.  The  grammar  itself  not 
only  drives  the  generation  of  test  cases  but  also  serves  as  a  concise  documentation  of  the  test  plan. 

In  this  paper,  we  describe  the  test  case  generator,  show  how  it  works  in  typical  examples,  compare  it  with 
related  techniques,  and  discuss  how  it  can  be  used  in  conjunction  with  various  testing  heuristics. 
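
A small sketch may help show how a grammar with attributes can produce both a test input and its expected output. It is only an illustration under stated assumptions, not the generator described in the paper: the grammar (simple arithmetic expressions), the inherited `depth` attribute that limits recursion, the synthesized `value` attribute that carries the expected result, and all function names are hypothetical.

```python
import random

def gen_expr(rng, depth):
    """Generate (text, value) for the grammar
         E -> NUM | ( E + E ) | ( E * E )
    `depth` is an inherited attribute that bounds recursion;
    `value` is a synthesized attribute giving the expected output.
    """
    if depth == 0 or rng.random() < 0.3:
        n = rng.randint(0, 9)
        return str(n), n
    left_text, left_val = gen_expr(rng, depth - 1)
    right_text, right_val = gen_expr(rng, depth - 1)
    if rng.random() < 0.5:
        return f"({left_text} + {right_text})", left_val + right_val
    return f"({left_text} * {right_text})", left_val * right_val

def generate_test_cases(count, max_depth=3, seed=0):
    """Produce `count` random (input, expected output) pairs."""
    rng = random.Random(seed)
    return [gen_expr(rng, max_depth) for _ in range(count)]

cases = generate_test_cases(5)
```

Each pair can be handed to a test driver that feeds the input text to an expression evaluator under test and compares its result with the expected value carried by the attribute. Systematic rather than random generation would replace the random choices with an enumeration of productions.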

[Dunh83]  Abbreviated  Introduction:  Software  engineering  can  attain  the  status  of  scientific  discipline  only  if  it  is 
built  upon  a  solid  foundation  of  objective  measurement.  In  fact,  its  maturity  as  a  discipline  will  be  reflected  in  the 
degree  to  which  measurement  becomes  a  normal  part  of  the  software  development  and  maintenance  process. 

Measurement  is  a  difficult  area  to  discuss  as  an  isolated  topic  because  it  is  fundamental  to  virtually  all 
aspects  of  software  engineering  and  management.  But  this  is  precisely  what  makes  it  so  important.  In  one  article, 
we  cannot  hope  to  cover  all  possible  uses  for  measurement  or  all  possible  types  of  measures  and  models.  We  can, 
however,  provide  a  framework  for  discussion. 

Measurement  from  the  perspective  of  those  involved  in  software  development  and  maintenance  has  prac¬ 
tical  benefits  as  a  management,  development,  and  contractual  tool.  From  the  scientist’s  perspective,  it  is  useful 
in  the  development  of  quantitative  models.  This  article  reviews  measurement  and  modeling  activities:  resource 
expenditures,  software  and  system  reliability,  system  performance,  and  user  performance.  It  then  describes  the 
measurement  activities  in  the  STARS  program,  which  are  designed  to  further  advance  the  technology  of  meas¬ 
urement  and  to  facilitate  its  widespread  use. 

[Dunh86]  Digital  computers  are  being  used  more  frequently  for  process  control  applications  in  which  the  cost  of 
system  failure  is  high.  Consideration  of  the  potentially  life-threatening  risk,  resulting  from  the  high  degree  of 
functionality  being  ascribed  to  the  software  components  of  these  systems,  has  stimulated  the  recommendation  of 
various designs for tolerating software faults. Such designs are not panaceas, for they still entail - as did the fault
intolerant designs they are superseding - an unknown probability of failure. The paper discusses four reliability
data  gathering  experiments  which  were  conducted  using  a  small  sample  of  programs  for  two  problems  having 
ultrareliability  requirements,  n-version  programming  for  fault  detection,  and  repetitive  run  modeling  for  failure 
and  fault  rate  estimation.  The  experimental  results  agree  with  those  of  Nagel  and  Skrivan  in  that  the  program 
error  rates  suggest  an  approximate  log-linear  pattern  and  the  individual  faults  occurred  with  significantly  different 
error  rates.  Additional  analysis  of  the  experimental  data  raises  new  questions  concerning  the  phenomenon  of 
interacting  faults.  This  phenomenon  may  provide  one  explanation  for  software  reliability  decay.  The  fourth 
experiment  underscored  the  difficulty  in  distinguishing  between  observations  of  deficiencies  in  the  design  of  the 
algorithm and observations of software faults for real-time process control software. These experiments are a
part  of  a  program  of  serial  experiments  being  pursued  by  the  System  Validation  Methods  of  NASA-Langley 
Research  Center  to  find  a  means  of  credibly  performing  reliability  evaluations  of  flight  control  software. 
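
The use of n-version programming for fault detection can be sketched as follows: several independently written versions are run on the same input and any disagreement is flagged. This is a generic illustration, not the experimental setup of the paper; the three versions, the seeded fault, and the function names are hypothetical.

```python
from collections import Counter

def run_versions(versions, x):
    """Run independently written versions on the same input and vote.

    Returns (majority_output, dissenting_indices). Any dissent signals a
    detected fault in at least one version (or in the majority).
    """
    outputs = [v(x) for v in versions]
    majority, _ = Counter(outputs).most_common(1)[0]
    dissent = [i for i, out in enumerate(outputs) if out != majority]
    return majority, dissent

# Three hypothetical versions of "absolute value"; the third has a seeded fault.
v1 = lambda x: abs(x)
v2 = lambda x: x if x >= 0 else -x
v3 = lambda x: x if x > -5 else -x   # wrong for -5 < x < 0
```

Note that voting detects only faults on which versions disagree; versions that fail identically on the same input would go unnoticed by this check.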

[Dunn74]  Abbreviated  Preface:  This  book  has  been  based  on  notes  originally  developed  for  a  one-semester 
course in analysis of variance, regression, and covariance. We had two general objectives in [its development].
[To  prepare]  a  self-contained  textbook,  [and  one]  that  might  be  useful  as  a  reference. 

Chapters  1  through  4  provide  the  introduction  necessary  for  studying  analysis  of  variance  and  regression. 
Chapters  5,  6,  and  7  deal  with  the  fixed  effects  model  analysis  of  variance  Model  I.  Chapter  8  gives  a  brief
introduction  to  confounding,  still  with  Model  I.  Chapter  9  presents  variable  effects  models  (Models  II  and  III).
Chapters  10,  11,  and  12  introduce  linear,  multiple,  and  polynomial  regression;  Chapter  13  is  concerned  with 
covariance  analysis.  In  Chapter  14  we  discuss  various  techniques  for  screening  data  before  analysis.  We  consider
this  material  so  important  that  it  must  be  gathered  together  into  a  single  chapter  and  placed  as  conspicuously  as 
possible;  clearly  it  cannot  be  the  first  chapter,  and  so  it  must  be  the  last. 

[Dunn82]  Abbreviated  Introduction:  As  we  see  it  -  and  it  is  heartening  to  note  that  this  is  becoming  the  prevailing
view  -  software  quality  assurance  is  the  mapping  of  the  managerial  precepts  and  design  disciplines  of  quality
assurance  onto  the  applicable  management  and  technological  space  of  software  engineering.  In  the  transfer,  fam¬ 
iliar  quality  assurance  approaches  to  improving  control  and  performance  metamorphose  into  techniques  and 
tools  different  from  those  to  which  the  quality  community  is  accustomed.  For  its  part,  software  approaches  to 
the  production  and  maintenance  of  computer  software  are  given  new  form  as  well  as  procedural  efficiency.  Yet 
both  communities,  to  their  mutual  advantage,  can  easily  relate  to  this  concept  of  software  quality  assurance. 

Software  quality  assurance  can  be  constructive,  can  avoid  being  a  bureaucratic  impediment,  only  by  draw¬ 
ing  upon  fundamental  concepts  of  both  software  engineering  and  quality  assurance.  It  is  the  resulting  amalgam 
which  we  set  out  to  describe  and  whose  many  ingredients  can  be  seen  in  the  road  map  to  the  balance  of  the  book,
which  is  provided  toward  the  end  of  Chapter  1. 

[Dunn84]  Table  of  Contents:  Introduction.  To  err  is  human;  to  find  the  bug,  divine,  an  overview  of  development
methodologies.  Static  methods.  Requirements  and  design  reviews,  code  reviews,  static  analysis,  proof  of 
correctness.  Dynamic  testing.  Matters  of  strategy,  glass  box  testing,  black  box  testing,  analysis  of  defect  and 
failure  data.  Operational  phase.  Configuration  control,  maintenance  and  modification. 

[Duns77]  Abstract:  One  measure  of  programming  complexity  is  the  number  of  “program  changes”  that  must  be 
made  from  the  initial  version  of  a  program  until  it  is  in  final  form.  A  count  of  errors  occurring  in  the  debugging 
process  is  an  accepted  measure  of  difficulty  in  programming.  Using  source  modules  from  an  experiment  involv¬ 
ing  thirty-three  subjects  developing  a  moderately  difficult  program,  it  has  been  demonstrated  that  “program 
changes”  correlates  well  with  the  count  of  errors.  In  addition,  subjects  whose  initial  version  of  a  program  had 
either  a  moderate  average  nesting  depth  and/or  a  moderate  usage  of  global  variables  made  fewer  program 
changes  during  development. 

[Duns78b]  Abstract:  Programming  complexity  (the  amount  of  difficulty  in  constructing  a  program)  may  depend 
upon  certain  programming  factors  (choices  of  programming  language  features).  Using  program  changes  as  a  pro¬ 
gramming  complexity  measure,  previous  research  has  identified  five  potential  programming  factors.  This  paper 
suggests  that  subjects  tend  to  use  the  same  levels  of  these  factors  in  two  different  programming  languages,
supporting  the  conjecture  that  these  factors  are  elements  of  individual  programming  style.  It  also  describes  five
potential  programming  factors,  and  although  each  has  intuitive  appeal,  only  average  procedure  length  was 
related  to  programming  complexity. 

[Dura78]  Abstract:  Program  testing  remains  the  major  way  in  which  program  designers  convince  themselves  of 
the  validity  of  their  programs.  Software  reliability  measures  based  on  hardware  reliability  concepts  have  been 
proposed,  but  adequate  models  of  software  reliability  have  not  yet  been  developed.  Investigators  have  recently 
studied  formal  program  testing  concepts,  with  promising  results,  but  have  not  seriously  considered  quantitative 
measures  of  the  “degree  of  correctness”  of  a  program.  We  present  models  for  determining,  via  testing,  such  pro¬ 
babilistic  measures  of  program  correctness  as  the  probability  that  a  program  will  run  correctly  on  randomly 
chosen  input  data,  confidence  intervals  on  the  number  of  errors  remaining  in  a  program,  and  the  probability  that 
the  program  has  been  completely  tested.  We  also  introduce  a  procedure  for  enhancing  correctness  estimates  by 
quantifying  the  error  reducing  performance  of  the  methods  used  to  develop  and  debug  a  program. 

[Dura80]  Abstract:  The  point  of  all  validation  techniques  is  to  raise  assurance  about  the  program  under  study,
but  no  current  methods  can  be  realistically  thought  to  give  100%  assurance  that  a  validated  program  will  perform 
correctly.  There  are  currently  no  useful  ways  for  quantifying  how  “well-validated”  a  program  is.  One  measure  of 
program  correctness  is  the  proportion  of  elements  in  the  program’s  input  domain  for  which  it  fails  to  execute 
correctly,  since  the  proportion  is  zero  if  and  only  if  the  program  is  correct.  This  proportion  can  be  estimated  statistically
from  the  results  of  program  tests  and  from  prior  subjective  assessments  of  the  program’s  correctness.  Three 
examples  are  presented  of  methods  for  determining  s-confidence  bounds  on  the  failure  proportion.  It  is  shown 
that  there  are  reasonable  conditions  (for  programs  with  a  finite  number  of  paths)  for  which  ensuring  the  testing 
of  all  paths  does  not  give  better  assurance  of  program  correctness. 
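
A minimal sketch of the kind of bound described above, under a simplifying assumption that is not taken from the paper: n tests are drawn independently and at random from the input domain, and none of them fails. Solving (1 - theta)^n = alpha for theta then gives a classical upper confidence bound on the failure proportion theta. The function name and the zero-failure restriction are illustrative; the paper itself presents three methods, including ones that use prior subjective assessments.

```python
def failure_proportion_upper_bound(n_tests: int, confidence: float) -> float:
    """Upper confidence bound on the failure proportion after
    n_tests failure-free tests drawn at random from the input domain."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    # Largest theta still consistent with n failure-free tests at level alpha:
    # solve (1 - theta) ** n_tests = alpha for theta.
    return 1.0 - alpha ** (1.0 / n_tests)


if __name__ == "__main__":
    for n in (10, 100, 1000, 3000):
        bound = failure_proportion_upper_bound(n, 0.95)
        print(f"{n:5d} failure-free tests: theta <= {bound:.6f} (95% confidence)")
```

The bound shrinks only roughly in proportion to 1/n, which illustrates why testing alone cannot be realistically expected to give 100% assurance.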

[Dura81a]  Abstract:  Random  testing  of  programs  is  usually  (but  not  always)  viewed  as  a  worst  case  of  program
testing.  Test  case  generation  that  takes  into  account  the  program  structure  is  usually  preferred.  Path  testing  is  an 
often  proposed  ideal  for  structural  testing.  Path  testing  is  treated  here  as  an  instance  of  partition  testing,  where 
by  partition  testing  is  meant  any  testing  scheme  which  forces  execution  of  at  least  one  test  case  from  each  subset 
of  a  partition  of  the  input  domain.  Simulation  results  are  presented  which  suggest  that  random  testing  may  often 
be  more  cost  effective  than  partition  testing  schemes.  Also,  results  of  actual  random  testing  experiments  are 
presented  which  confirm  the  viability  of  random  testing  as  a  useful  validation  tool. 
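
The comparison described above can be sketched in closed form rather than by simulation. The model below is an assumption made for illustration: the input domain is split into subdomains, subdomain i is hit by a random input with probability p_i, and an input from it fails with probability theta_i. Partition testing runs one test per subdomain; random testing draws n inputs from the whole domain. The function names and the example numbers are hypothetical and are not the paper's data.

```python
from math import prod


def partition_detection_probability(failure_rates):
    """P(at least one failure) when one test is drawn from each subdomain."""
    return 1.0 - prod(1.0 - theta for theta in failure_rates)


def random_detection_probability(subdomain_probs, failure_rates, n_tests):
    """P(at least one failure) in n_tests inputs drawn from the whole domain."""
    if len(subdomain_probs) != len(failure_rates):
        raise ValueError("one probability and one failure rate per subdomain")
    # Overall failure rate is the usage-weighted average of subdomain rates.
    overall = sum(p * theta for p, theta in zip(subdomain_probs, failure_rates))
    return 1.0 - (1.0 - overall) ** n_tests


if __name__ == "__main__":
    probs = [0.25, 0.25, 0.25, 0.25]   # hypothetical subdomain probabilities
    rates = [0.0, 0.0, 0.0, 0.2]       # hypothetical subdomain failure rates
    print("partition, 4 tests:", partition_detection_probability(rates))
    for n in (4, 5, 8):
        print(f"random, {n} tests:", random_detection_probability(probs, rates, n))
```

In this made-up case, random testing with the same number of tests detects a failure slightly less often than partition testing, but one extra random test already reverses the ordering. Because random tests are usually cheaper to generate, this is one way a cost-effectiveness argument of the kind made in the abstract can arise.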

[Dura81b]  Abstract:  Mills’  capture-recapture  sampling  method  allows  the  estimation  of  the  number  of  errors  in
a  program  by  randomly  inserting  known  errors  and  then  testing  the  program  for  both  inserted  and  indigenous 
errors.  This  correspondence  shows  how  correct  confidence  limits  and  maximum  likelihood  estimates  can  be 
obtained  from  the  test  results.  Both  fixed  sample  size  testing  and  sequential  testing  are  considered. 
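
For orientation, the basic capture-recapture (error seeding) ratio estimate can be sketched as follows. This is only the simple point estimate that the method starts from, not the corrected maximum likelihood estimates or confidence limits derived in the correspondence. The function and parameter names are illustrative.

```python
def estimate_indigenous_errors(seeded_total: int,
                               seeded_found: int,
                               indigenous_found: int) -> float:
    """Ratio estimate of the number of indigenous errors in a program.

    Assumes seeded and indigenous errors are equally likely to be found,
    so the fraction of seeded errors recovered estimates the fraction of
    indigenous errors recovered.
    """
    if seeded_total <= 0:
        raise ValueError("at least one error must be seeded")
    if not 0 < seeded_found <= seeded_total:
        raise ValueError("seeded_found must be between 1 and seeded_total")
    if indigenous_found < 0:
        raise ValueError("indigenous_found cannot be negative")
    return indigenous_found * seeded_total / seeded_found


if __name__ == "__main__":
    # Hypothetical numbers: 20 errors seeded, 10 of them found,
    # together with 4 indigenous errors.
    print(estimate_indigenous_errors(20, 10, 4))
```

The estimate is undefined when no seeded error has been found, which is one reason a more careful statistical treatment is needed.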

[Duva80]  Abstract:  This  paper  summarizes  the  results  of  a  study  to  determine  the  data  requirements  for 
software  reliability  modeling.  The  major  assumptions  of  the  models  are  presented  along  with  a  brief  description 
of  their  uses  and  the  data  needed  to  exercise  the  models.  Methodologies  for  evaluating  failure  databases  are 
presented  including  a  sample  evaluation  to  determine  the  adequacy  of  the  data  to  do  comparisons  across  a  wide 
variety  of  projects  and  to  determine  if  the  database  contains  data  elements  as  required  by  the  various  models. 

[Dyer80]  Abbreviated  Introduction:  In  this  paper  on  software  development,  the  focus  is  on  the  blend  of  modern
software  methods  with  established  development  practices.  Reducing  diversity,  increasing  visibility,  and  improv¬ 
ing  productivity  in  the  development  process  are  the  principal  means  of  intellectual  control  of  development. 
Improved  product  quality,  product  transportability,  and  product  adaptability  are  longer-range  goals. 

The  development  methodology  is  defined  in  terms  of  practices  that  recognize  the  increased  precision 
introduced  by  modern  design  methods  and  that  attempt  to  introduce  the  rigor  of  modern  design  into  the  methods
of  software  product  development.  Code  management  practices  deal  with  the  implementation  of  software  and  the 
control  of  its  release  as  a  product.  Integration  engineering  practices  address  plans  for  building  software  pro¬ 
ducts. 

[Dyer85a]  Introduction:  Testing  to  confirm  that  the  implemented  software  (and  its  design)  satisfies  its  intended 
requirements  is  performed  by  someone  other  than  the  software  developer.  In  this  case,  black  box  or  function 
testing  is  performed,  not  to  verify  that  the  code  executes,  but,  more  importantly,  that  it  performs  its  intended 
job.  Independent  testers  are  disassociated  from  the  product  design  and  are  more  objective  in  verifying  that  a  pro¬ 
duct  operates  as  expected.  This  independent  testing  is  commonly  defined  as  the  software  verification  step  in  the 
software  life  cycle. 

This  chapter  discusses  an  approach  to  software  verification  which  downplays  the  current  error  detection 
focus  and  promotes  an  operational  testing  focus.  Functional  testing  from  an  operational  use  perspective  demon¬ 
strates  not  only  that  the  software  performs  its  job,  but  also  that  it  does  it  in  the  planned  user  environments.  The 
emphasis  on  user  perspective  should  help  ensure  the  development  of  executing  products  which  are  also  usable  in 
the  field  and  whose  field  reliability  (MTTF)  can  be  estimated  during  development. 

To  implement  this  approach,  a  statistical  testing  procedure  is  defined  for  function  verification.  The  pro¬ 
cedure  uses  the  probability  distributions  of  the  product  inputs  and  randomized  sampling  techniques  to  organize 
test  material.  The  randomization  supports  statistical  inferences  about  the  product’s  operational  characteristics 
and  an  estimation  of  its  expected  reliability  (MTTF). 
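
A rough sketch of the two ingredients named above, randomized selection of test inputs according to their probability of use and a reliability estimate from the resulting failure data. The usage profile, the function names, and the simple MTTF point estimate (total execution time divided by number of failures) are assumptions made for illustration, not the chapter's procedure.

```python
import random


def sample_test_cases(usage_profile, n_tests, seed=None):
    """Draw input classes at random, weighted by assumed usage probability.

    usage_profile maps an input class name to its relative frequency of use.
    """
    if n_tests < 0:
        raise ValueError("n_tests cannot be negative")
    classes = list(usage_profile)
    weights = [usage_profile[name] for name in classes]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    rng = random.Random(seed)  # seeded for reproducible test selection
    return rng.choices(classes, weights=weights, k=n_tests)


def estimate_mttf(total_execution_time, failures):
    """Point estimate of mean time to failure; None if no failure was seen."""
    if failures == 0:
        return None
    return total_execution_time / failures


if __name__ == "__main__":
    profile = {"query": 0.7, "update": 0.25, "shutdown": 0.05}  # hypothetical
    print(sample_test_cases(profile, 10, seed=1))
    print(estimate_mttf(100.0, 4))
```

Sampling in proportion to expected use is what allows the observed failure behavior to be read as an estimate of field behavior.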

[East72]  Abstract:  A  method  is  presented  for  obtaining  system  confidence  limits  based  on  component  test
results.  The  technique  consists  of  estimating  the  asymptotic  variance  of  the  maximum  likelihood  estimate  of
system  reliability,  equating  this  to  the  estimate  of  the  variance  in  binomial  sampling,  and  solving  for  n  and  x,  the 
pseudo-numbers  of  system  tests  and  successes.  These  are  then  substituted  into  the  incomplete  beta  function  and 
confidence  limits  obtained  in  the  usual  way  for  binomial  sampling. 

[Eckh85]  Summary:  Fundamental  to  the  development  of  redundant  software  techniques  (known  as  fault-tolerant 
software)  is  an  understanding  of  the  impact  of  multiple  joint  occurrences  of  errors,  referred  to  here  as  coincident 
errors.  A  theoretical  basis  for  the  study  of  redundant  software  is  developed  which  (1)  provides  a  probabilistic 
framework  for  empirically  evaluating  the  effectiveness  of  the  general  (N-version)  strategy  when  component  ver¬ 
sions  are  subject  to  coincident  errors,  and  (2)  permits  an  analytical  study  of  the  effects  of  these  errors.  The  basic 
assumptions  of  the  model  are:  (i)  independently  designed  software  components  are  chosen  in  a  random  sample 
and  (ii)  in  the  user  environment,  the  system  is  required  to  execute  on  a  stationary  input  series.  An  intensity  func¬ 
tion,  called  the  intensity  of  coincident  errors,  has  a  central  role  in  the  model.  This  function  describes  the  propen¬ 
sity  of  a  population  of  programmers  to  introduce  design  faults  in  such  a  way  that  software  components  fail 
together  when  executing  in  the  user  environment.  The  model  is  used  to  give  conditions  under  which  an  N-version 
system  is  a  better  strategy  for  reducing  system  failure  probability  than  relying  on  a  single  version  of  software.  In 
addition,  a  condition  which  limits  the  effectiveness  of  a  fault-tolerant  strategy  is  studied,  and  we  ask  whether  sys¬ 
tem  failure  probability  varies  monotonically  with  increasing  N  or  whether  an  optimal  choice  of  N  exists. 
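
One reading of the model summarized above can be sketched numerically, assuming majority voting among an odd number N of versions. For an input on which a randomly chosen version fails with probability theta (the intensity of coincident errors at that input), the N-version system fails when a majority of versions fail. Averaging over the input distribution gives the system failure probability. The discrete intensity profile and all names below are hypothetical.

```python
from math import comb


def majority_failure_probability(theta: float, n_versions: int) -> float:
    """P(a majority of n_versions independently chosen versions fail)
    on an input where each version fails with probability theta."""
    if n_versions < 1 or n_versions % 2 == 0:
        raise ValueError("n_versions must be a positive odd number")
    majority = (n_versions + 1) // 2
    return sum(
        comb(n_versions, k) * theta ** k * (1.0 - theta) ** (n_versions - k)
        for k in range(majority, n_versions + 1)
    )


def system_failure_probability(intensity_profile, n_versions: int) -> float:
    """Average majority-failure probability over the input distribution.

    intensity_profile is a list of (input_weight, theta) pairs whose
    weights sum to 1.
    """
    total_weight = sum(weight for weight, _ in intensity_profile)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("input weights must sum to 1")
    return sum(
        weight * majority_failure_probability(theta, n_versions)
        for weight, theta in intensity_profile
    )


if __name__ == "__main__":
    # Hypothetical profile: most inputs are easy, 1% are hard for everyone.
    profile = [(0.99, 0.001), (0.01, 0.6)]
    for n in (1, 3, 5):
        print(n, system_failure_probability(profile, n))
```

On inputs where theta exceeds one half, adding versions makes a majority failure more likely, so a small set of inputs that most programmers get wrong can limit or even reverse the benefit of redundancy.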

[Ehre76]  Abstract:  The  number  of  tests,  which  are  necessary  to  prove  the  performance  of  a  program,  can  be 
reduced  to  an  executable  number,  if  the  structure  of  the  program  is  investigated.  The  analysis  starts  from  the 
memory  dump.  The  program  is  first  divided  into  those  pieces,  which  are  without  labels  or  branchings.  Then  the 
mappings  of  the  program  and  their  input  and  output  areas  are  identified,  further  those  areas  which  influence 
branchings.  The  next  step  states  which  ranges  of  values  in  the  individual  areas  are  distinguished  by  the  program 
and  which  junctions  of  areas  are  relevant.  From  this,  the  kind  and  the  number  of  the  necessary  tests  can  be 
derived.  By  means  of  observing  their  main  variables,  loops  are  divided  into  simpler  structures.

The  method  has  been  applied  for  the  verification  of  the  user  programs  of  the  protection  system  of  the  800 
MWe  boiling  water  reactor  plant  in  Brunsbuttel.

[Ehrl87]  Abstract:  A  number  of  time-domain  software  reliability  models  attempt  to  predict  the  growth  of  a
system’s  reliability  during  the  system  test  phase  of  the  development  life  cycle.  In  this  paper  we  examine  the  results
of  applying  several  types  of  Poisson-process  models  to  the  development  of  a  large  system  for  which  system  test 
was  performed  in  two  parallel  tracks,  using  different  strategies  for  test  data  selection.  We  show  that  the  reliability 
growth  predicted  by  non-homogeneous  Poisson  process  models  was  found  for  only  one  of  these  testing  stra¬ 
tegies.  These  results  imply  that  the  applicability  of  a  reliability  growth  model  to  a  given  software  development 
project  will  depend  on  the  nature  of  that  project’s  system  test  process;  they  also  raise  theoretical  questions  about 
the  assumption  of  certain  statistical  properties  for  failure  occurrence  during  testing. 

[Elme69]  Abstract:  Functional  testing  of  operating  systems  is  in  transition  from  a  predominantly  imprecise  art 
to  an  increasingly  precise  science.  The  process  that  controls  this  testing  is  maturing  correspondingly.  The  laissez- 
faire  approach  is  giving  way  to  a  disciplined  approach  characterized  by  rigorous  definition  of  the  test  plan,  sys¬ 
tematic  control  of  the  test  effort,  and  objective  quantitative  measurement  of  the  test  coverage.  This  paper 
describes  just  such  a  disciplined  test  control  process,  which  is  composed  of  five  steps:  1)  the  survey,  which  estab¬ 
lishes  the  intended  extent  of  testing;  2)  the  identification,  which  creates  a  list  of  functional  variations  eligible  for 
testing;  3)  the  appraisal,  which  ranks  and  subsets  the  eligible  variations  so  that  test  resources  can  be  directed  at
those  with  the  higher  payoff;  4)  the  review,  which  calculates  the  test  coverage  of  the  test  case  library;  and  5)  the
monitor,  which  verifies  attainment  of  the  planned  testing  coverage.  Throughout  the  test  process,  specification 
testing  is  distinguished  from  program  testing. 

[Elme71]  Abbreviated  Abstract:  This  paper  discusses  some  lessons  [the  author  has]  learned  from  testing  large,
complex  software  systems.  Actually,  [the  author]  believes  these  lessons  are  equally  applicable  to  small  software
packages.  While  the  large  systems  are  admittedly  particularly  vulnerable  to  error,  the  small  systems  may  have 
many  more  users  and  thus  the  impact  of  error  can  be  equally  great. 

[The  author]  will  be  addressing  functional  testing  but  not  performance  testing.  Just  as  performance  testing 
involves  space  and  time  measures  of  system  utilization,  so  functional  testing  involves  spatial  and  temporal  meas¬ 
ures  of  system  quality. 

[Elme73]  Abstract:  This  paper  describes  a  partially  automated  and  disciplined  process  for  testing  the
equivalence  between  a  functional  specification  and  its  implementation  in  software,  firmware  or  hardware.  The 
semantic  content  of  a  specification  expressed  in  a  natural  language  is  restated  in  boolean  graph  form  as  a  logical 
relationship  between  causes  and  effects.  This  cause-effect  graph  is  processed  to  yield  (1)  a  list  of  the  functional 
primitives,  or  variations,  (2)  the  definition  of  a  syntactically  feasible  test  library  which  will  efficiently  validate 
these  variations,  and  (3)  the  faulty  variations  detectable  by  each  test.  The  correspondence  between  functional 
variations  and  decision  table  rules  is  also  addressed. 

[Elsh76b]  Abstract:  The  source  code  for  120  production  PL/I  programs  from  several  General  Motors’  commer¬ 
cial  computing  installations  has  been  collected.  The  programs  have  been  scanned  both  manually  and  automati¬ 
cally.  Some  data  from  the  scanning  process  are  presented  and  interpreted. 

The  programs  are  considered  with  respect  to  five  attributes:  1)  the  size  of  the  programs,  2)  the  readability 
of  the  programs,  3)  the  complexity  of  the  programs,  4)  the  discipline  followed  by  the  programmers,  and  5)  the 
use  of  the  programming  language.  Each  area  is  reviewed  with  pertinent  data  presented  whenever  it  is  available.

The  report  should  be  of  interest  to  anyone  involved  with  programming.  The  report  helps  explicitly  iden¬ 
tify  some  areas  of  programming  in  which  a  better  job  could  be  done.  Although  the  programs  analyzed  are  written 
in  PL/I,  those  persons  from  installations  using  other  languages,  particularly  Cobol,  have  indicated  that  the  infor¬ 
mation  presented  is  typical. 

[Elsh78c]  Abstract:  The  October  1977  issue  of  SIGPLAN  Notices  carries  an  article  that  compares  functionally
equivalent  programs  that  differ  in  their  internal  structure.  The  basis  for  comparing  the  programs  is  a  measure 
called  cyclomatic  complexity  whose  value  is  the  cyclomatic  number  of  the  graph  that  corresponds  to  the  flow  of 
control  of  the  program.  One  program  is  of  particular  interest  since  all  of  the  well-structured  versions  of  the  pro¬ 
gram  that  are  discussed  have  a  higher  cyclomatic  complexity  than  the  unstructured  version.  In  this  paper  another 
well-structured  version  of  the  program  is  presented  for  which  the  cyclomatic  complexity  is  reduced  to  that  of  the 
original  unstructured  version.  In  the  process,  some  of  the  shortcomings  of  the  cyclomatic  number  as  a  complex¬ 
ity  measure  are  revealed. 
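
For reference, the cyclomatic number of a control flow graph with e edges, n nodes, and p connected components is v(G) = e - n + 2p. The sketch below computes it from an edge list; the graph encoding and the function name are illustrative choices, and the example graphs are not the programs discussed in the paper.

```python
def cyclomatic_number(edges, num_components=1):
    """Cyclomatic number v(G) = e - n + 2p of a control flow graph.

    edges is a list of (source, target) pairs; nodes are inferred from it.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


if __name__ == "__main__":
    straight_line = [("entry", "exit")]
    if_then_else = [("entry", "then"), ("entry", "else"),
                    ("then", "exit"), ("else", "exit")]
    loop = [("entry", "test"), ("test", "body"),
            ("body", "test"), ("test", "exit")]
    for name, graph in [("straight line", straight_line),
                        ("if-then-else", if_then_else),
                        ("loop", loop)]:
        print(name, cyclomatic_number(graph))
```

Because the value depends only on the counts of edges and nodes, restructuring a program without changing the number of decisions leaves it unchanged, which is the kind of insensitivity the paper examines.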

[Elsh84]  Abstract:  Twenty  program  complexity  measures  are  studied  with  respect  to  how  well  they  identify  the 
more  complex  procedures  in  a  software  system.  The  measures  have  been  applied  to  three  large  sets  of  PL/I  pro¬ 
cedures  representing  three  different  types  of  applications.  Four  of  these  complexity  measures  have  been  found  to 
form  a  characteristic  set.  That  is,  when  procedures  are  kept  within  reasonable  bounds  for  the  four  selected  meas¬ 
ures,  they  will  most  likely  be  within  reasonable  bounds  for  all  of  the  other  measures.  The  measures  and  their 
interpreted  meanings  are: 

•  length  -  the  quantity  of  source  code, 

•  unique  operators  -  the  variety  of  programming  language  actions, 

•  data  difficulty  -  the  average  number  of  variable  appearances,  and 

•  unique  operands  -  the  variety  of  constants  and  variables. 

[Elsp72a]  Abstract:  The  purpose  of  this  paper  is  to  point  out  the  significant  quantity  of  work  in  progress  on  tech¬ 
niques  that  will  enable  programmers  to  prove  their  programs  correct.  This  work  has  included:  investigations  in 
the  theory  of  program  schemas  or  abstract  programs;  development  of  the  art  of  the  informal  or  manual  proof  of 
correctness;  and  development  of  mechanical  or  semi-mechanical  approaches  to  proving  correctness.  At  present, 
these  mechanical  approaches  rely  upon  the  availability  of  powerful  theorem-provers,  development  of  which  is 
being  actively  pursued.  All  of  these  technical  areas  are  here  surveyed  in  detail,  and  recommendations  are  made
concerning  the  direction  of  future  research  toward  producing  a  semi-mechanical  program  verifier. 

[Emde81]  Abstract:  Our  goal  is  to  obtain  a  specification  of  a  relational  data  base  as  an  abstract  data  type  in  such 
a  way  that  a  computer  program  can  simulate  on  a  small  scale  the  intended  use  of  the  data  base  by  generating
formal  consequences  of  the  specification  (that  is,  without  the  existence  of  any  implementation  of  the  data  base).
There  are  two  candidates  for  the  specification  formalism  to  be  used:  equations  and  the  Horn  clauses  of  logic. 

Apart  from  a  specification  of  a  relational  data  base,  the  paper  is  devoted  entirely  to  a  comparison  between 
equations  and  clauses.  We  compare  three  aspects:  mathematical  semantics,  the  computational  aspects,  and 
expressiveness.  We  propose  to  discard  equations  as  a  distinct  formalism,  but  will  regard  them  as  a  special  case  of 
clauses.  In  principle  we  use  as  specification  a  clausal  sentence  containing  literally  the  equations  conventionally 
used  in  data  type  specification,  but  we  find  certain  slight  departures  conducive  to  clarity.  As  a  program  (to  be  exe¬ 
cuted  by  a  PROLOG  processor)  we  use  another  sentence  obtained  from  the  specification  by  a  translation  pro¬ 
cess  that  guarantees  correctness. 

[Emer84]  Abstract:  The  decomposition  of  a  large  program  into  modules  can  be  guided  by  the  use  of  a  property 
called  cohesion,  first  described  by  Constantine.  Cohesion  is  a  quality  that  describes  the  degree  to  which  the  dif¬ 
ferent  actions  performed  by  a  module  contribute  to  a  unified  function.  However,  this  technique  may  be  difficult 
to  apply  due  to  the  subjective  nature  of  the  definitions  of  levels  of  cohesion.  In  this  paper  a  software  metric  is 
defined  and  proposed  as  a  discriminant  for  classifying  modules  according  to  their  cohesion.  Formal  properties  of 
the  metric  are  derived  which  can  be  used  to  set  the  metric  value  ranges  for  module  classification. 

[Endr75]  Abstract:  Program  errors  detected  during  internal  testing  of  the  operating  system  DOS/VS  form  the 
basis  for  an  investigation  of  error  distributions  in  system  programs.  Using  a  classification  of  the  errors  according 
to  various  attributes,  conclusions  can  be  drawn  concerning  the  possible  causes  of  these  errors.  The  information 
thus  obtained  is  applied  in  a  discussion  of  the  most  effective  methods  for  the  detection  and  prevention  of  errors.

[Eric85]  Abstract:  Deployment  of  software  controlled  systems  for  providing  communications  services  has  grown 
very  rapidly.  For  example,  a  large  proportion  of  telephone  central  offices  are  now  Stored  Program  Control  Sys¬ 
tems  (SPCS).  In  the  course  of  this  growth,  it  has  been  found  that  software,  like  hardware,  is  subject  to  various 
kinds  of  problems  throughout  the  software  life  cycle  which  may  seriously  affect  the  software  cost,  intended 
delivery  date,  field  performance,  and  in-service  support.  As  a  result,  Bell  Communications  Research,  Inc.
(BELLCORE)  has  proposed  generic  software  reliability  and  quality  requirements  for  telecommunications 
software  to  meet  typical  telephone  company  needs.  These  requirements  are  intended  to  reduce  software  life 
cycle  costs  by  assuring  that  the  software  is  designed,  developed,  tested,  produced,  installed,  and  supported  in  a 
manner  that  is  consistent  with  modern  software  quality  concepts  and  practices.

[Evan83a]  Abstract:  Several  studies  have  appeared  in  recent  years  examining  the  sensitivity  of  standard  software 
complexity  metrics  to  common  rules  of  program  structuring.  In  most  cases,  these  studies  found  support  for  the 
use  of  certain  metrics  as  indices  of  program  quality  as  represented  by  program  structure.  In  the  research 
described  in  this  paper,  a  broader  analysis  of  metric  sensitivity  to  the  structuring  rules  was  conducted.  The  con¬ 
clusions  reached  differ  greatly  from  those  previously  advocated  in  the  literature;  i.e.,  the  metrics  under  con¬ 
sideration  are  shown  to  be  relatively  insensitive  to  program  structure. 

[Evan83b]  Abbreviated  Introduction:  In  their  study  of  the  psychological  complexity  of  software,  Curtis  et  al.
remark  that  “no  simple  relationship  between  computational  and  psychological  complexity  is  expected.”  In  the 
discussion  below,  we  elaborate  on  this  observation  through  a  series  of  examples. 

[Evan84a]  Abstract:  The  complexity  of  control  flow  in  a  program  is  generally  believed  to  be  an  important  deter¬ 
minant  of  the  testability  and  the  comprehensibility  of  the  program.  Several  metrics  have  been  proposed  to  meas¬ 
ure  this  aspect  of  complexity,  including  the  nesting  level  metric  of  Harrison  and  Magel  and  the  cyclomatic 
complexity  of  McCabe.  In  this  paper,  the  theory  underlying  cyclomatic  complexity  is  analyzed  and  shown  to  be
poorly  developed,  and  the  nesting  level  metric  is  reconstructed  from  a  simpler  conceptual  basis.  In  each  case, 
the  emphasis  is  on  the  need  to  provide  software  metrics  with  an  adequate  theoretical  foundation. 

[Evan84b]  Abbreviated  Preface:  This  book  guides  the  software  manager  through  the  software  testing  morass. 
The  book  will  identify  the  individual  components  and  test  levels  that  must  be  integrated  into  a  cohesive  structure, 
and  outline  how  the  testing  program  is  to  be  planned  and  managed.  It  will  identify  tools,  techniques,  and  metho¬ 
dologies  that  must  be  incorporated  if  testing  is  to  succeed. 

The  book  offers  solutions  to  the  recurring  management  problems  characteristic  of  software  testing, 
which  invariably  turn  into  crises  during  the  later  stages  of  a  software  project.  Problems  all  center  around  a  single 
theme:  The  project  planners  have  not  adequately  estimated  what  will  occur  during  the  testing  period  and  cannot 
control  resources  and  activities  during  this  critical  implementation  stage. 

[Evan84c]  Abstract:  In  a  recent  paper  the  author  has  presented  evidence  that,  contrary  to  several  studies  in  the 
literature,  certain  software  complexity  metrics  are  not  consistently  sensitive  to  the  application  of  program  style 
rules.  In  the  present  paper,  these  results  are  summarized  and  extended  to  two  additional  metrics.  The  new  results 
indicate  both  that  current  complexity  metrics  are  improper  indices  of  program  quality,  as  measured  by  style,  and 
that  many  commonly  used  style  rules  do  not  address  questions  of  minimizing  interprocedural  information  flow
complexity.

[Faga76]  Abstract:  Substantial  net  improvements  in  programming  quality  and  productivity  have  been  obtained
through  the  use  of  formal  inspections  of  design  and  of  code.  Improvements  are  made  possible  by  a  systematic 
and  efficient  design  and  code  verification  process,  with  well-defined  roles  for  inspection  participants.  The 
manner  in  which  inspection  data  is  categorized  and  made  suitable  for  process  analysis  is  an  important  factor  in 
attaining  the  improvements.  It  is  shown  that  by  using  inspection  results,  a  mechanism  for  initial  error  reduction 
followed  by  ever-improving  error  rates  can  be  achieved. 

[Faga86]  Abstract:  This  paper  presents  new  studies  and  experiences  that  enhance  the  use  of  the  inspection  pro¬ 
cess  and  improve  its  contribution  to  development  of  defect-free  software  on  time  and  at  lower  costs.  Examples 
of  benefits  are  cited  followed  by  descriptions  of  the  process  and  some  methods  of  obtaining  the  enhanced 
results. 

Software  inspection  is  a  method  of  static  testing  to  verify  that  software  meets  its  requirements.  It  engages 
the  developers  and  others  in  a  formal  process  of  investigation  that  usually  detects  more  defects  in  the  product,
and  at  lower  cost,  than  does  machine  testing.  Users  of  the  method  report  very  significant  improvements  in  quality
that  are  accompanied  by  lower  development  costs  and  greatly  reduced  maintenance  efforts.  Excellent  results 
have  been  obtained  by  small  and  large  organizations  in  all  aspects  of  new  development  as  well  as  in  maintenance. 
There  is  some  evidence  that  developers  who  participate  in  the  inspection  of  their  own  product  actually  create 
fewer  defects  in  future  work.  Because  inspections  formalize  the  development  process,  productivity  and  quality 
enhancing  tools  can  be  adopted  more  easily  and  rapidly. 

[Fair75]  Abstract:  This  paper  describes  an  experimental  program  testing  facility  called  the  interactive  semantic 
modeling  system  (ISMS).  The  ISMS  is  designed  to  allow  experimentation  with  a  wide  variety  of  tools  for
collecting,  analyzing,  and  displaying  testing  information.  The  design  methodology  is  applicable  to  procedural
programming  languages,  and  Algol  60  is  being  used  as  the  vehicle  for  elaboration  of  design  principles  and  implementation
techniques. 

This  paper  discusses  the  ISMS  design,  and  describes  the  various  types  of  analysis  and  display  tools  being 
developed  to  facilitate  program  testing.  The  ISM  Preprocessor  is  described,  and  an  example  is  presented  to  illus¬ 
trate  the  data  structures  utilized  in  the  ISMS. 

[Fair79]  Abstract:  ALADDIN  is  an  interactive  facility  for  debugging  and  testing  of  assembly  language  pro¬ 
grams.  ALADDIN  differs  from  traditional  debuggers  by  allowing  the  user  to  specify  breakpoint  assertions, 
rather than breakpoint locations. Assertions are logical relations among various components of the program
state.  If  an  assertion  becomes  false  during  execution  of  the  object  program  a  breakpoint  is  executed  and  control 
is  passed  to  the  user's  terminal.  ALADDIN  can  also  be  used  as  a  testing  tool  to  verify  that  asserted  behavior 
matches  actual  behavior  under  various  sets  of  input  data  and  test  conditions. 
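
The breakpoint-assertion idea can be illustrated with a minimal Python sketch: execution halts where a stated relation over the program state first becomes false, not at a fixed location. The `run_with_assertion` helper, the state dictionary, and the three sample steps are hypothetical illustrations; they are not ALADDIN's interface, which targets assembly language programs.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Step = Callable[[State], None]


def run_with_assertion(steps: List[Step], state: State,
                       assertion: Callable[[State], bool]) -> Optional[int]:
    """Run steps in order; return the index of the first step after which
    the assertion is false (the "breakpoint"), or None if it always holds."""
    for index, step in enumerate(steps):
        step(state)
        if not assertion(state):
            return index  # control would pass to the user here
    return None


def load(state: State) -> None:
    state["acc"] = 5


def subtract(state: State) -> None:
    state["acc"] -= 7


def store(state: State) -> None:
    state["mem"] = state["acc"]


# Hypothetical three-step program checked against the assertion "acc >= 0".
program = [load, subtract, store]
machine = {"acc": 0, "mem": 0}
break_index = run_with_assertion(program, machine, lambda s: s["acc"] >= 0)
print(break_index)
```

Because the check runs after every step, the user states what must hold and does not have to guess where it might fail.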

[Farr83]  Abstract:  With  the  ever-increasing  role  that  software  is  playing  in  the  weapon  systems,  a  great  need  has 
arisen  for  tools  that  are  useful  in  developing  cost-effective  software.  An  area  of  research  has  arisen  over  the  last 
10  years  in  providing  a  software  manager  quantitative  statements  about  the  reliability  of  the  software.  Using  this 
quantitative  measure,  the  manager  can  make  a  determination  of  when  software  testing  should  terminate  and  how 
to  best  utilize  testing  personnel.  This  report  discusses  the  various  approaches  that  have  been  advocated  for  relia¬ 
bility  estimation.  It  reviews  the  various  model  assumptions,  the  estimates  of  reliability,  the  precision  of  those 
estimates,  and  the  data  required  for  their  implementation.  A  comparison  is  then  made  among  some  of  these 
models  based  upon  studies  that  have  been  done.  General  comments  concerning  software  reliability  implementa¬ 
tion  are  discussed  in  the  final  section  of  the  report. 

[Farr88]  Abstract:  The  concept  of  software  reliability  and  its  measurement  is  receiving  a  lot  of  attention  in  the 
software development community. With the ever increasing role that software is playing in today’s and tomorrow’s
society, software developers and users are asking: “Just how ‘good’ is the software?” and “how much testing
should be done before the software is released?” The software reliability methodology attempts to provide
quantitative  measures  to  help  answer  these  questions.  Unfortunately,  to  arrive  at  these  measures  requires  com¬ 
plex  numerical  computations,  usually  requiring  the  assistance  of  a  computer.  A  software  reliability  analysis  tool 
called  the  “Statistical  Modeling  and  Estimation  of  Reliability  Functions  for  Software  (SMERFS)”  was 
developed  several  years  ago  for  this  purpose.  Originally,  the  tool  was  designed  for  a  mainframe/mini  computer 
environment.  This  paper  describes  an  adaptation  of  that  tool  for  the  personal  computer  (PC)  and  relates  how  it 
differs  from  the  one  for  larger  computer  systems. 

The  PC  version  of  SMERFS  includes  several  features  which  are  illustrated  in  this  paper,  using  as  data 
either time-between-error occurrence (wall clock or central processing unit) or error counts per-time-period.
These  features  are  data  input  and  data  management,  editing  capability,  data  transformations,  model  fitting  for 
software  reliability  measurement,  and  features  of  the  software  that  allow  the  user  to  determine  the  adequacy  of  fit 
as  well  as  aids  in  determining  the  best  model.  The  current  version  of  SMERFS  has  incorporated  eight  software 
reliability models. These models include the following: John Musa’s Execution Model, Goel’s Non-homogeneous
Poisson Models using both types of data, the Geometric Model, a Generalized Poisson Model,
Schneidewind’s  Model,  and  Brooks  and  Motley’s  Model. 

[Feat89]  Abstract:  Constructing  specifications  of  complex  tasks  is  often  a  laborious  activity  in  spite  of  the  rich 
vocabulary  provided  by  specification  languages.  An  incremental  approach  to  construction  is  proposed,  with  the 
virtue of offering considerable opportunity for mechanized support. Following this approach one builds a specification
through a series of elaborations that incrementally adjust a simple initial specification. Elaborations perform
both refinements, adding further detail, and adaptations, retracting oversimplifications and tailoring
approximations  to  the  specifics  of  the  task.  It  is  anticipated  that  the  vast  majority  of  elaborations  can  be  con¬ 
cisely  described  to  a  mechanism  which  will  then  perform  them  automatically.  When  elaborations  are  indepen¬ 
dent,  they  can  be  applied  in  parallel,  leading  to  diverging  specifications  which  must  later  be  recombined. 

The  approach  is  intended  to  facilitate  comprehension  and  maintenance  of  specifications,  as  well  as  their 
initial  construction.  The  advantages  of  following  this  approach  stem  from  the  gradual  nature  of  the  elaboration 
process,  the  separation  of  concerns  through  following  independent  elaborations  in  parallel,  the  simplicity  of  the 
individual  elaboration  steps  (the  effects  of  each  step  are  well  delineated),  and  the  availability  of  an  explicit  record 
of  construction. 

[Feld89]  Abstract:  The  advent  of  high-resolution  graphics  workstations  at  reasonable  cost  offers  great  potential 
in  the  development  of  high-level,  graphics-oriented  debugging  tools.  The  advent  of  programming  languages, 
Ada,  in  particular,  which  support  concurrency  with  high-level  primitives,  affords  the  opportunity  to  develop  new 
models of debugging for programs incorporating concurrent tasks. The marriage of graphics and concurrency-oriented
debugging can provide powerful tools indeed.

We have developed a demonstration-quality graphics-assisted debugger for intertask communication in
Ada.  Based  on  the  static  task-specification  diagrams  of  Booch,  the  debugger  animates  the  activity  of  a  collection 
of  communicating  tasks,  and  runs  on  a  DEC  GIGI  terminal  connected  to  a  VAX  11-780  under  TeleSoft’s  Ada 
compiler. 

The model has been subjected to empirical validation, using undergraduate students as experimental subjects.
Subjects were required to debug erroneous tasking programs using both the graphical debugger and a textual
one.

[Ferr77]  Abstract:  State  machines  provide  a  convenient  and  indispensable  mathematical  framework  for  defin¬ 
ing  precise  specifications  of  complex  software  systems.  Such  specifications  stand  as  pivotal  elements  between 
requirements  and  designs,  and  permit  the  definition  of  three  levels  of  correctness  in  software  systems,  namely  1) 
interpretation correctness between requirements and specification, 2) program correctness between specification
and design, and 3) implementation correctness between design and programmed hardware. Formal techniques
already exist for program correctness and implementation correctness. However, there is a need for formal
techniques for interpretation correctness; methods of semantics are proposed as a basis for such techniques.

[Fetz88]  Abbreviated  Introduction:  The  notion  of  program  verification  appears  to  trade  upon  an  equivocation. 
Algorithms,  as  logical  structures,  are  appropriate  subjects  for  deductive  verification.  Programs,  as  causal  models 
of  those  structures,  are  not.  The  success  of  program  verification  as  a  generally  applicable  and  completely  reliable 
method  for  guaranteeing  program  performance  is  not  even  a  theoretical  possibility. 

[Feue79a] Abstract: It is no longer a surprise that program maintenance dominates the total cost of a large
software  system  over  its  lifetime.  In  response  to  these  costs,  the  emphasis  in  program  design  has  largely  shifted 
from  the  time  and  space  issues  of  machine  efficiency  to  issues  of  clear  and  flexible  program  structures  that  can  be 
easily  maintained. 

The  goal  of  this  project  is  to  identify  measurable  program  properties  that  influence  maintainability.  More 
precisely,  we  examine  the  effect  of  various  program  characteristics  on  the  subsequent  frequency  and  magnitude 
of  program  errors. 

[Fisc77] Abstract: The literature in the areas of software management and software engineering admits to a possible
reduction in the reliability of software after modifications have been made. Validation of maintenance modifications
is commonly referred to as retest, and has yet to be adequately resolved. The problem is how to efficiently
select previously run test cases to be rerun on the software to assure no degradation of reliability. This
paper  develops  several  alternative  retest  philosophies  and  identifies  a  common  operations  research  technique  for 
solution.  Detailed  examples  show  how  0-1  integer  programming  can  identify  a  minimum  number  of  previously 
executed  tests  necessary  to  fully  retest  every  affected  program  element  at  least  once.  Use  of  this  model  to  deter¬ 
mine  proper  selection  of  test  cases  can  reduce  the  cost  of  software  maintenance  and  increase  confidence  in  the 
reliability  of  the  code. 
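
The retest selection problem can be written as a 0-1 program: minimize the number of selected tests subject to every affected program element being covered by at least one selected test. The Python sketch below is a minimal illustration under stated assumptions: the test names, the coverage data, and the `minimum_retest_set` helper are hypothetical, and exhaustive enumeration stands in for an integer programming solver, so it only suits small examples.

```python
from itertools import combinations
from typing import Dict, Optional, Set, Tuple


def minimum_retest_set(coverage: Dict[str, Set[str]],
                       affected: Set[str]) -> Optional[Tuple[str, ...]]:
    """Return a smallest tuple of test names whose combined coverage includes
    every affected element, or None if no combination covers them all.

    Equivalent 0-1 program: minimize sum(x_t) subject to, for each affected
    element e, sum(x_t over tests t covering e) >= 1, with x_t in {0, 1}.
    """
    tests = sorted(coverage)
    # Try subsets in order of increasing size, so the first hit is minimal.
    for size in range(len(tests) + 1):
        for chosen in combinations(tests, size):
            covered = set().union(*(coverage[t] for t in chosen))
            if affected <= covered:
                return chosen
    return None


# Hypothetical data: which program elements each earlier test exercised.
coverage = {
    "t1": {"a", "b"},
    "t2": {"b", "c"},
    "t3": {"c", "d"},
    "t4": {"a", "d"},
}
selected = minimum_retest_set(coverage, {"a", "b", "c", "d"})
print(selected)
```

For realistic test suites the same constraints would be handed to a real solver; the enumeration only makes the objective and the covering constraints concrete.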

[Fitz78a]  Abstract:  During  recent  years,  there  have  been  many  attempts  to  define  and  measure  the  “complex¬ 
ity”  of  a  computer  program.  Maurice  Halstead  has  developed  a  theory  that  gives  objective  measures  of  software 
complexity.  Various  studies  and  experiments  have  shown  that  the  theory’s  predictions  of  the  number  of  bugs  in 
programs  and  of  the  time  required  to  implement  a  program  are  amazingly  accurate.  It  is  a  promising  theory 
worthy  of  much  more  probing  scientific  investigation. 

This  paper  reviews  the  theory,  called  “software  science,”  and  the  evidence  supporting  it.  A  brief  descrip¬ 
tion  of  a  related  theory,  called  “software  physics,”  is  included. 

[Flon77]  Abstract:  This  thesis  applies  and  extends  mathematical  program  verification  to  systems  programs.  The 
design  methodology  is  based  upon  the  use  of  abstract  data  types  and  the  construction  and  verification  of  both 
specifications and implementations for them. The abstract data type is a means of modularization which encapsulates
the representation of a data structure and the algorithms which operate directly upon it. The specification
technique  appeals  to  various  mathematical  structures  (e.g.  sets  and  sequences)  to  describe  an  abstract  state  for 
objects  of  a  given  type.  The  correctness  of  the  formal  specifications  is  cast  in  terms  of  the  proof  rule  is  given  to 
formulate  the  theorems  necessary  for  proving  the  invariance  of  predicates  across  formal  specifications.  The 
applicability  of  the  methodology  to  operating  systems  is  explored.  It  is  found  that  a  hierarchical  decomposition  is 
most  amenable  to  verification,  and  that  the  implementation  language  used  is  a  function  of  that  hierarchy.  The 
example  of  a  process  dispatcher  module  of  a  hypothetical  operating  system  is  used  to  illustrate  the  process  of 
design,  specification,  implementation,  and  verification  using  the  methodology.  Various  properties  are  proven  of 
the  abstract  specifications,  including  one  representation  of  the  concept  of  fair  service.  Programs  are  then  written 
for  the  specifications  and  their  correctness  is  verified. 

[Flon78a]  Abstract:  We  present  weakest  pre-conditions  which  describe  weak  correctness,  blocking,  deadlock, 
and  starvation  for  nondeterministic  programs.  A  procedure  for  converting  parallel  programs  to  nondeterministic 
programs  is  described,  and  the  correctness  of  various  example  parallel  programs  is  treated  in  this  manner. 
Among  these  are  a  busy-wait  mutual  exclusion  scheme,  and  the  problem  of  the  Five  Dining  Philosophers. 

[Flon81]  Abstract:  We  describe  a  formal  theory  of  the  total  correctness  of  parallel  programs,  including  such 
heretofore  theoretically  incomplete  properties  as  safety  from  deadlock  and  starvation  under  fair-scheduling.  We 
present  a  sound  and  complete  set  of  proof  rules  for  the  total  correctness  of  parallel  programs  expressed  in  non¬ 
deterministic  form. 

The  proof  of  soundness  and  completeness  is  novel  in  that  we  show  that  the  weakest  pre-conditions  for  the 
correctness  criteria  are  actually  fixed-points  (least  or  greatest)  of  continuous  functions  over  the  complete  lattice 
of  total  predicates.  We  have  obtained  proof  rule  schemata  which  can  universally  be  applied  to  least  or  greatest 
fixed-points  of  continuous  functions.  Therefore,  a  system  of  proof  rules  is  a  priori  sound  and  complete  once  it  is 
shown  that  certain  weakest  pre-conditions  are  extremum  fixed-points.  The  relationship  between  true  parallelism 
and  nondeterminism  is  also  discussed. 

[Form77]  Abstract:  In  this  article  we  introduce  the  problem  of  computer  software  reliability  and  discuss  a  pro¬ 
babilistic  model  for  describing  the  failure  of  software.  We  suggest  a  procedure  for  estimating  the  parameters  of 
the  model  and  propose  a  stopping  rule  for  debugging  the  software.  We  apply  our  procedure  to  some  published 
data  on  software  failures. 

[Form79]  Abstract:  This  paper  discusses  certain  stochastic  aspects  of  the  software  reliability  problem.  First  an 
empirical  stopping  rule  for  debugging  and  testing  computer  software  is  discussed.  Then  some  results  are 
presented  on  choosing  a  time  interval  for  testing  the  hypothesis  that  a  software  system  contains  no  errors,  given 
certain  costs  and  risk  constraints. 

[Fosd76a]  Abstract:  The  ways  that  the  methods  of  data  flow  analysis  can  be  applied  to  improve  software  reliabil¬ 
ity  are  described.  There  is  also  a  review  of  the  basic  terminology  from  graph  theory  and  from  analysis  in  global 
program  optimization.  The  notation  of  regular  expressions  is  used  to  describe  actions  on  data  for  sets  of  paths. 
These  expressions  provide  the  basis  of  a  classification  scheme  for  data  flow  which  represents  patterns  of  data 
flow  along  paths  within  subprograms  and  along  paths  which  cross  subprogram  boundaries.  Fast  algorithms,  origi¬ 
nally  introduced  for  global  optimization,  are  described  and  it  is  shown  how  they  can  be  used  to  implement  the 
classification  scheme.  It  is  then  shown  how  these  same  algorithms  can  also  be  used  to  detect  the  presence  of  data 
flow anomalies which are symptomatic of programming errors. Finally, some characteristics of and experience
with Dave, a data flow analysis system embodying some of these ideas, are described.

[Fosd76b]  Abstract:  In  an  earlier  paper,  the  authors  have  defined  type  1  and  type  2  data  flow  anomalies  to  be, 
respectively,  the  reference  to  an  undefined  variable  and  the  definition  of  a  variable  without  subsequent  reference. 
It  is  not  difficult  to  devise  search  techniques  to  detect  such  anomalies  when  the  anomalous  data  flow  is  contained 
in a single procedure. When the data flow crosses procedure boundaries, however, many difficulties may arise. In
this  paper,  we  carefully  define  the  conditions  under  which  interprocedural  anomalies  occur.  We  also  show  how 
algorithms  currently  used  in  global  program  optimization  can  easily  be  adapted  to  yield  highly  efficient  algorithms 
for  the  detection  of  such  interprocedural  anomalies. 
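
The two anomaly types defined above can be made concrete with a minimal Python sketch that scans a single execution path. This covers only the simple single-path, single-procedure case; it is not the interprocedural algorithm of the paper. The event encoding and the `find_anomalies` helper are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Event = Tuple[str, str]  # (action, variable); action is "def" or "ref"


def find_anomalies(path: List[Event]) -> List[Tuple[int, str, int]]:
    """Scan one execution path and return (anomaly_type, variable, position).

    Type 1: a variable is referenced before any definition on the path.
    Type 2: a definition is never referenced before it is overwritten or
    before the path ends.
    """
    anomalies: List[Tuple[int, str, int]] = []
    pending: Dict[str, int] = {}  # variable -> position of unreferenced def
    defined: Set[str] = set()
    for position, (action, var) in enumerate(path):
        if action == "ref":
            if var not in defined:
                anomalies.append((1, var, position))
            pending.pop(var, None)
        elif action == "def":
            if var in pending:
                anomalies.append((2, var, pending[var]))
            pending[var] = position
            defined.add(var)
    # Definitions still pending at the end of the path were never referenced.
    for var, position in pending.items():
        anomalies.append((2, var, position))
    return anomalies


# Hypothetical path: x is defined twice in a row, y is used undefined,
# and z is defined but never used.
sample_path = [("def", "x"), ("def", "x"), ("ref", "x"),
               ("ref", "y"), ("def", "z")]
report = find_anomalies(sample_path)
print(report)
```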

[Fost80]  Abstract:  A  hardware  failure  analysis  technique  adapted  to  software  yielded  three  rules  for  generating 
test  cases  sensitive  to  code  errors.  These  rules,  and  a  procedure  for  generating  these  cases,  are  given  with  exam¬ 
ples.  Areas  for  further  study  are  recommended. 

[Fost83]  Introduction:  The  subject  paper  uses  the  phrase  “error-sensitive”  and  includes  the  [initial]  ESTCA 
paper  in  its  references.  The  latter  paper  uses  the  same  phrase  to  describe  a  different  method.  This  comment 
challenges the claim that domain testing is a “more promising” error detection strategy. Figures in the subject
paper  are  referenced  but  not  reproduced.  Therefore  the  comment  cannot  be  verified  without  that  paper  at  hand. 

[Fost84] Abstract: The usual approach to testing software with logic expressions considers that n variables can
realize 2^(2^n) functions. Therefore, to distinguish one Boolean expression in n variables (a “correct” one)
from all others (“errors”) 2^n tests are necessary.
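
The counting argument can be checked by enumeration for small n. With n Boolean variables there are 2^n input rows and one output bit per row, hence 2^(2^n) distinct functions. Any function that differs from the correct one on a single row is exposed only by the test for that row, which is why exhaustive discrimination needs all 2^n tests. The Python sketch below works through n = 2; the `truth_table` helper and the choice of AND as the "correct" expression are illustrative assumptions.

```python
from itertools import product


def truth_table(func, n):
    """Return the outputs of func on all 2**n Boolean input combinations."""
    return tuple(func(*bits) for bits in product((False, True), repeat=n))


n = 2
inputs = list(product((False, True), repeat=n))
# Every possible truth table over n variables: one output bit per input row.
all_tables = list(product((False, True), repeat=len(inputs)))
print(len(inputs), len(all_tables))

correct = truth_table(lambda a, b: a and b, n)
# Tables that differ from the correct one on exactly one input row: each
# such "error" is revealed only by the test that exercises that row.
single_row_errors = [t for t in all_tables
                     if sum(x != y for x, y in zip(t, correct)) == 1]
print(len(single_row_errors))
```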

[Fost85]  Abbreviated  Introduction:  This  note  revises  and  re-substantiates  the  original  ESTCA  rule  [Fost80]  for 
generating  or  evaluating  test  data  for  simple  conditional  or  comparison  expressions.  Comments  on  other 
methods  are  included. 

The  rules  apply  to  the  lowest  program  component  -  “variable  operator  variable-or-constant”  expressions. 
Clearly,  if  any  part  of  any  expression  is  incorrect,  then  so  is  the  program.  Rules  developed  in  this  continuing 
study  are  intended  to  identify  data  that  meets  error  sensitivity  criteria  for  all  expression  types.  Rules  for  logic 
expressions and combinations of comparisons are in [Fost84].

[Fran80]  Abstract:  Discussed  is  a  distributed  system  based  on  communication  among  disjoint  processes,  where 
each  process  is  capable  of  achieving  a  post-condition  of  its  logic  space  in  such  a  way  that  the  conjunction  of  local 
post-conditions  implies  a  global  post-condition  of  the  whole  system.  The  system  is  then  augmented  with  extra 
control  communication  in  order  to  achieve  distributed  termination,  without  adding  new  channels  of  communica¬ 
tion.  The  algorithm  is  applied  to  a  problem  of  constructing  a  sorted  partition. 

[Fran85a]  Abstract:  This  paper  describes  ASSET,  a  tool  which  uses  information  about  a  program’s  data  flow  to 
aid  in  selecting  test  data  for  the  program  and  to  evaluate  test  data  adequacy.  ASSET  is  based  on  the  family  of 
data  flow  test  selection  and  test  data  adequacy  criteria  developed  by  Rapps  and  Weyuker.  ASSET  accepts  as 
input  a  program  written  in  a  subset  of  Pascal,  a  set  of  test  data,  and  one  of  the  data  flow  adequacy  criteria  and 
indicates  to  what  extent  the  criterion  has  been  satisfied  by  the  test  data. 
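
The kind of measurement described here, the extent to which test data satisfies a data flow criterion, can be sketched in Python as the fraction of required definition-use associations that the executed paths exercise. This is a minimal illustration, not ASSET's implementation: the node numbering, the hand-written `defs` and `uses` annotations, and both helper functions are hypothetical, and the required associations are supplied by hand, not derived from a Pascal program.

```python
from typing import Dict, List, Set, Tuple

Association = Tuple[str, int, int]  # (variable, defining node, using node)


def covered_associations(path: List[int],
                         defs: Dict[int, Set[str]],
                         uses: Dict[int, Set[str]]) -> Set[Association]:
    """Return the definition-use associations exercised by one executed path.

    A use of a variable at a node is paired with the most recent earlier
    node on the path that defined that variable.
    """
    last_def: Dict[str, int] = {}
    found: Set[Association] = set()
    for node in path:
        # Uses are processed before definitions so that "x := x + 1"
        # pairs its use of x with the previous definition.
        for var in uses.get(node, set()):
            if var in last_def:
                found.add((var, last_def[var], node))
        for var in defs.get(node, set()):
            last_def[var] = node
    return found


def coverage_ratio(required: Set[Association], paths: List[List[int]],
                   defs: Dict[int, Set[str]],
                   uses: Dict[int, Set[str]]) -> float:
    """Fraction of the required associations exercised by the test paths."""
    exercised: Set[Association] = set()
    for path in paths:
        exercised |= covered_associations(path, defs, uses)
    return len(required & exercised) / len(required)


# Hypothetical unit: node 1 defines x, node 2 tests x, node 3 does
# x := x + 1, node 4 prints x.
defs = {1: {"x"}, 3: {"x"}}
uses = {2: {"x"}, 3: {"x"}, 4: {"x"}}
required = {("x", 1, 2), ("x", 1, 3), ("x", 1, 4), ("x", 3, 4)}
one_test = coverage_ratio(required, [[1, 2, 4]], defs, uses)
two_tests = coverage_ratio(required, [[1, 2, 4], [1, 2, 3, 4]], defs, uses)
print(one_test, two_tests)
```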

[Fran86]  Abstract:  Most  test  data  adequacy  criteria  based  upon  path  selection  have  the  unfortunate  property 
that  for  some  programs  with  unexecutable  paths,  no  set  of  test  data  is  adequate.  In  this  paper  we  define  a  new 
family  of  adequacy  criteria,  derived  from  the  data  flow  testing  criteria,  which  circumvent  this  problem  by  only 
requiring  the  test  data  to  exercise  those  definition-use  associations  which  are  executable.  The  inclusion  relation¬ 
ship  among  these  criteria  is  explored. 

[Fran88]  Abstract:  A  test  data  adequacy  criterion  is  a  predicate  which  is  used  to  determine  whether  a  program 
has  been  tested  “enough.”  An  adequacy  criterion  is  applicable  if  for  every  program  there  exists  a  set  of  test  data 
for  the  program  which  satisfies  the  criterion.  Most  test  data  adequacy  criteria  based  on  path  selection  fail  to 
satisfy  the  applicability  property  because,  for  some  programs  with  unexecutable  paths,  no  adequate  set  of  test 
data  exists.  In  this  paper,  we  extend  the  definitions  of  the  previously  introduced  family  of  data  flow  testing  cri¬ 
teria  to  apply  to  programs  written  in  a  large  subset  of  Pascal.  We  then  define  a  new  family  of  adequacy  criteria 
called  feasible  data  flow  testing  criteria,  which  are  derived  from  the  data  flow  testing  criteria.  The  feasible  data 
flow testing criteria circumvent the problem of nonapplicability of the data flow testing criteria by requiring the
test  data  to  exercise  only  those  definition-use  associations  which  are  executable.  We  show  that  there  are  signifi¬ 
cant  differences  between  the  relationships  among  the  feasible  data  flow  testing  criteria. 

We  also  discuss  a  generalized  notion  of  the  executability  of  a  path  through  a  program  unit.  A  script  of  a 
testing  session  using  our  data  flow  testing  tool,  ASSET,  is  included  in  the  Appendix. 

[Fuji77] Abstract: Verification is presented as a method of ensuring high reliability of software systems. Verification
consists of an early analysis of program requirements and design specifications, followed by extensive program
analysis and system/program execution testing. It is performed in parallel with software development by an
organization  independent  of  the  development  group,  with  the  objective  of  detecting  conceptual  and  implementa¬ 
tion  errors  before  program  acceptance.  Software  analysis  and  testing  aids  for  cost-effectively  automating  routine 
tasks  are  described.  The  results  of  several  verification  projects  are  discussed  to  illustrate  common  types  of  errors 
and  techniques  for  their  detection.  Key  aspects  of  verification  planning  are  presented  for  projects  that  are 
required  to  achieve  highly  reliable  software. 

[Gabo76]  Abstract:  In  this  paper  we  analyze  the  complexity  of  algorithms  for  two  problems  that  arise  in 
automatic  test  path  generation  for  programs:  the  problem  of  building  a  path  through  a  specified  set  of  program 
statements  and  the  problem  of  building  a  path  which  satisfies  impossible-pairs  restrictions  on  statement  pairs. 
These  problems  are  both  reduced  to  graph  traversal  problems.  We  give  an  efficient  algorithm  for  the  first,  and 
show  that  the  second  is  NP-complete. 

[Gaff79] Abstract: McCabe has shown that the complexity of a computer program may be defined as the
cyclomatic  number  of  the  graph  theoretic  representation  of  its  control  structure  and  that  that  number  (less  one) 
is  equal  to  the  number  of  conditional  jumps  (J)  in  the  structure.  It  can  be  shown  that  a  range  of  values  for  this  fig¬ 
ure  can  be  derived  from  an  estimate  of  the  number  of  inputs  and  outputs  in  a  program,  such  as  obtainable  from 
its specification. If the number of inputs and the number of outputs are each N, then the least value of J (= J_min) is N log2 N. This figure is derived
from  information  theoretic  arguments.  It  is  noted  to  be  equal  to  the  least  number  of  switches  in  a  telephone 
exchange  and  to  the  least  number  of  operations  in  a  sort.  If  we  know  that  a  program  whose  size  is  to  be  estimated 
is  expected  to  be  similar  to  others  in  which  the  proportion  of  conditional  jumps  is  “L,”  then  the  minimum 
number  of  instructions  would  be  equal  to  L*/^,  while  the  more  likely  number  would  be  L*/m.T,  where  J ^  = 
N  .  This  formulation  is  compared  with  several  of  Halstead’s  formulas  which  can  be  used  to  estimate  the  size  of  a 
program  using  estimates  of  the  number  of  input/output  parameters,  the  numbers  of  operators,  and  the  language 
level.  Comparisons  are  made  using  the  new  “complexity” -based  estimators,  and  those  of  Halstead. 
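
McCabe's relation between the cyclomatic number and the number of conditional jumps can be checked on a small example. The Python sketch below uses the standard formula V(G) = E - N + 2 for a single connected flow graph and counts two-way decision nodes; the sample graph and both helper names are hypothetical, and the equality V(G) - 1 = J is only asserted here for graphs whose decisions are all binary.

```python
from typing import Dict, List


def cyclomatic_number(graph: Dict[str, List[str]]) -> int:
    """McCabe's V(G) = E - N + 2 for a single connected control flow graph."""
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in graph.values())
    return edges - len(nodes) + 2


def conditional_jumps(graph: Dict[str, List[str]]) -> int:
    """Count nodes with exactly two successors (binary decisions)."""
    return sum(1 for successors in graph.values() if len(successors) == 2)


# Hypothetical flow graph: an if-else followed by a while loop.
flow = {
    "entry": ["if"],
    "if": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
v = cyclomatic_number(flow)
j = conditional_jumps(flow)
print(v, j)
```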

[Gaff81a] Abstract: This paper describes some of the potential for applying software metrics to the management
of  the  software  development  process.  It  also  considers  some  of  the  practical  difficulties  one  typically  faces  in 
evolving  and  validating  a  software  metric.  One  difficulty  is  the  collection  of  baseline  data  in  the  real  world  of 
software  production  in  which  controlled  experiments  typically  are  not  possible.  The  results  of  some  recent  quan¬ 
titative  ‘metrics’  investigations  are  presented  and  their  practical  implementations  for  software  estimation  and 
control  are  cited.  These  investigations  are  thought  to  be  representative  of  the  process  of  evaluating  software  data 
not  obtained  under  ‘controlled’  conditions  such  as  is  typically  the  situation  in  the  natural  science  laboratory. 

[Gaff81b] Abstract: The nature of “software quality” and some software metrics are defined and their relationship
to traditional software indicators such as “maintainability” and “reliability” is suggested. Recent work in
the  field  is  summarized  and  an  outlook  for  software  metrics  in  quality  assurance  is  provided.  The  material  was 
originally  presented  as  a  tutorial  at  the  “ACM  SIGMETRICS  Workshop/Symposium  of  Measurement  and 
Evaluation  of  Software  Quality”  on  March  25, 1981. 

[Gaff88]  Abstract:  Availability  is  a  significant  measure  of  software  performance.  A  system’s  availability  is  a 
function  of  the  availability  of  its  software  component  which  is  directly  related  to  the  number  of  errors  remaining 
in  it  at  delivery,  the  latent  error  content.  This  paper  presents  a  method  for  estimating  the  latent  error  content  of 
an element of software; this can be done commencing with data obtained during design and code inspections.
The  availability  of  the  software  unit,  then,  is  a  function  of  the  rate  of  discovery  of  these  errors. 

[Gall81] Abstract: In the second part of this work, the author formulates a new inductive assertion method
applying  to  the  class  of  nondeterministic  flowchart  programs  with  recursive  procedures  studied  in  part  1.  Using 
results  on  unfolding  proved  in  part  1,  he  proves  that  this  method  is  sound  and  complete  with  a  finite  number  of 
assertions.  He  studies  four  notions  of  correctness:  two  notions  of  partial  correctness  (existential  and  universal) 
and  the  corresponding  notions  of  total  correctness.  He  also  formalizes  two  notions  of  extension  and  equivalence 
(existential  and  universal)  in  the  second-order  predicate  calculus. 

[Gann75]  Abstract:  The  language  in  which  programs  are  written  can  have  a  substantial  effect  on  their  reliability. 
This  paper  discusses  the  design  of  programming  languages  to  enhance  reliability.  It  presents  several  general 
design principles, and then applies them to particular language constructs. Since we cannot logically prove the
validity  of  such  design  principles,  empirical  evidence  is  needed  to  support  or  discredit  them.  Gannon  has  per¬ 
formed  a  major  experiment  to  measure  the  effect  of  nine  specific  language-design  decisions  in  one  context. 
Analysis  of  the  frequency  and  persistence  of  errors  shows  that  several  decisions  had  a  significant  impact  on  relia¬ 
bility. 

[Gann76]  Abstract:  The  goal  of  reliable  programming  is  to  minimize  the  number  of  errors  in  completed  pro¬ 
grams.  The  language  in  which  programs  are  written  can  have  a  significant  impact  on  the  reliability  of  the  pro¬ 
gramming  process. 

One  common  language  feature  that  appears  in  different  forms  in  many  programming  languages  is  the  data 
type.  Data  types  may  be  associated  with  operands  in  one  of  three  ways:  statically  typed  operands.  Furthermore, 
programmers  trying  to  convert  an  operand  from  one  type  to  another,  who  were  forced  to  grapple  with  the 
representation  of  the  operand,  committed  errors  that  cast  doubt  upon  the  ability  of  programmers  to  write  reli¬ 
ably  in  a  language  which  treats  its  operands  as  collections  of  bits. 

This  preliminary  data  suggests  that  a  more  detailed  and  controlled  experiment  will  be  required  to  enable 
us  to  draw  firm  conclusions  about  the  role  of  data  types  in  reliable  programming.  A  proposal  for  such  an  experi¬ 
ment  is  described  in  the  paper. 

[Gann77]  Abstract:  The  language  in  which  programs  are  written  can  have  a  substantial  effect  on  the  reliability  of 
the  resulting  programs.  This  paper  discusses  an  experiment  that  compares  the  programming  reliability  of  sub¬ 
jects  using  a  statically  typed  language  and  a  “typeless”  language.  Analysis  of  the  number  of  errors  and  the 
number  of  runs  containing  errors  shows  that,  at  least  in  one  environment,  the  use  of  a  statically  typed  language 
can  increase  programming  reliability.  Detailed  analysis  of  the  errors  made  by  the  subjects  in  programming  solu¬ 
tions  to  reasonably  small  problems  shows  that  the  subjects  had  difficulty  manipulating  the  representation  of  data. 

[Gann79] Abstract: Two software testing techniques, static analysis and dynamic path (branch) testing, are receiving
a great deal of attention in the world of software engineering these days. However, empirical evidence of their
ability  to  detect  errors  is  very  limited,  as  is  data  concerning  the  resource  investment  their  use  requires.  Research¬ 
ers  such  as  Goodenough  and  Howden  have  estimated  or  graded  these  testing  methods,  as  well  as  such  other  tech¬ 
niques  as  interface  consistency,  symbolic  testing,  and  special  values  testing.  However,  this  paper  seeks  to  demon¬ 
strate  empirically  the  types  of  errors  one  can  expect  to  uncover  and  to  measure  the  engineering  and  computer 
time  which  may  be  required  by  the  two  testing  techniques  for  each  class  of  errors  during  system-level  testing. 

[Gann80]  Abbreviated  Introduction:  A  series  of  articles  have  made  the  data  type  “traversable  stack”  something 
of  a  cause  celebre.  At  the  University  of  Maryland  we  are  constructing  a  system  called  DAISTS  (Data  Abstraction
Implementation,  Specification,  and  Testing  System)  for  testing  data  abstraction  implementations,  and  here
report  how  DAISTS  fared  on  two  traversable  stack  specifications.  We  concluded  that  subtle  errors  (by  definition
those  undetected  by  an  author  in  her  or  his  own  work)  are  sometimes  easy  to  discover  through  testing,  but
that  other,  more  obvious  mistakes  may  slip  by  tests. 

[Gann81]  Abstract:  A  compiler-based  system  DAISTS  that  combines  a  data-abstraction  implementation
language  (derived  from  the  SIMULA  class)  with  specification  by  algebraic  axioms  is  described.  The  compiler, 
presented  with  two  independent  syntactic  objects  in  the  axioms  and  implementing  code,  compiles  a  “program” 
that  consists  of  the  former  as  test  driver  for  the  latter.  Data  points,  in  the  form  of  expressions  using  the  abstract 
functions  and  constant  values,  are  fed  to  this  program  to  determine  if  the  implementation  and  axioms  agree. 
Along  the  way,  structural  testing  measures  can  be  applied  to  both  code  and  axioms  to  evaluate  the  test  data. 
Although  a  successful  test  does  not  conclusively  demonstrate  the  consistency  of  axioms  and  code,  in  practice  the 
tests  are  seldom  successful,  revealing  errors.  The  advantage  over  conventional  programming  systems  is
threefold:

1.  The  presence  of  the  axioms  eliminates  the  need  for  a  test  oracle;  only  inputs  need  be  supplied. 

2.  Testing  is  automated:  a  user  writes  axioms,  implementation,  and  test  points;  the  system  writes  the  test  drivers. 

3.  The  results  of  tests  are  often  surprising  and  helpful  because  it  is  difficult  to  get  away  with  “trivial”  tests:  what  is
not  significant  for  the  code  is  liable  to  be  a  severe  test  of  the  axioms,  and  vice  versa.
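
The idea of letting axioms serve as the test oracle can be sketched as follows. This is an illustration only, not DAISTS itself: the Stack class, the axiom names, and the test points are all hypothetical, and Python stands in for the SIMULA-derived implementation language.

```python
# Hypothetical illustration: algebraic axioms act as the oracle for a stack
# implementation. None of these names come from DAISTS.

class Stack:
    """Immutable stack implemented over a tuple."""

    def __init__(self, items=()):
        self._items = tuple(items)

    def push(self, x):
        return Stack(self._items + (x,))

    def pop(self):
        return Stack(self._items[:-1])

    def top(self):
        return self._items[-1]

    def is_empty(self):
        return not self._items

    def __eq__(self, other):
        return isinstance(other, Stack) and self._items == other._items


# Each axiom is a predicate over a test point (s, x) and should hold for every point.
AXIOMS = {
    "pop(push(s, x)) = s": lambda s, x: s.push(x).pop() == s,
    "top(push(s, x)) = x": lambda s, x: s.push(x).top() == x,
    "not is_empty(push(s, x))": lambda s, x: not s.push(x).is_empty(),
}


def run_axioms(test_points):
    """Return (axiom name, point index) for every axiom that fails on a point."""
    failures = []
    for i, (s, x) in enumerate(test_points):
        for name, holds in AXIOMS.items():
            if not holds(s, x):
                failures.append((name, i))
    return failures


# Only inputs are supplied; the axioms decide whether each result is acceptable.
points = [(Stack(), 1), (Stack([1, 2]), 3)]
print(run_axioms(points))  # an empty list means axioms and code agree on these points
```

Only the test points are written by hand here; no expected outputs are supplied, which mirrors the first advantage listed above.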

[Gann85]  Abstract:  Packages  are  one  of  the  primary  features  of  Ada.  They  can  be  used  to  group  declarations  or 
subprograms  or  to  create  new  encapsulated  types.  Metrics  are  presented  which  characterize  the  use  of  Ada  packages,
indicating  where  program  structure  may  make  changes  difficult,  and  suggesting  how  the  structure  may  be
improved.  The  use  of  such  metrics  should  aid  in  the  transition  to,  and  better  use  of  Ada.  The  metrics  are  applied 
to  examples  of  a  ground-support  satellite  system. 

[Gann86]  Abstract:  Modules  allow  programmers  to  group  related  data  and/or  procedures  and  to  limit  the 
amount  of  information  that  is  accessible  to  the  rest  of  the  program.  Splitting  a  program  into  modules  should 
localize  the  effects  of  program  changes  to  correct  errors  or  to  improve  the  implementation  (i.e.,  making  it  more 
robust  or  more  efficient).  In  addition,  since  modules  are  usually  self-contained,  they  can  be  reused  from  project 
to  project.  The  designers  of  Ada  recognized  three  major  uses  for  modules: 

1.  A  named  collection  of  declarations  that  makes  a  group  of  types  and  variables  available  much  like  a  Fortran 
common  block; 

2.  A  group  of  related  subprograms  that  provides  a  library  facility; 

3.  An  encapsulated  data  type  that  provides  the  names  of  the  type  and  its  operations,  but  hides  the  details  of  the
representation  of  objects  of  the  type  and  implementation  of  the  type’s  operations. 

While  the  first  two  uses  are  familiar  to  many  programmers,  the  third  use  is  not  supported  by  many  commonly 
used  programming  languages.  Strong  syntactic  clues  are  available  to  help  programmers  decide  what  objects 
comprise  the  first  two  kinds  of  modules  (e.g.,  all  types  and  constants,  a  collection  of  global  variables,  or  a  set  of 
utility  routines),  but  fewer  hints  are  available  to  aid  in  grouping  objects  in  problem-oriented  terms.  Deciding 
what  objects  to  encapsulate  in  a  system  is  a  formidable  challenge. 

[Garc84]  Abstract:  In  this  paper  we  discuss  the  issues  involved  in  debugging  a  distributed  computing  system.  We 
describe  the  major  differences  between  debugging  a  distributed  system  and  debugging  a  sequential  program.  We 
suggest  a  methodology  for  distributed  debugging,  and  we  propose  various  tools  or  aids. 

[Garm81]  Discussion  of  the  software  problem  which  delayed  the  first  Shuttle  orbital  flight.

[Geig79]  Abbreviated  Introduction:  The  validation  strategy  presented  here  can  be  considered  as  a  first  step 
toward  proving  the  correct  functioning  of  a  real-time  software  system,  in  this  case  for  an  advanced  computerized 
nuclear  reactor  protection  system.  It  may  also  serve  as  a  guideline  for  the  systematic  validation  and  testing  of 
other  safety  oriented  systems. 

[Gell78]  Abstract:  Proofs  of  program  correctness  tend  to  be  long  and  tedious,  whereas  testing,  though  useful  in
detecting  errors,  usually  does  not  guarantee  correctness.  This  paper  introduces  a  technique  whereby  test  data 
can  be  used  in  proving  program  correctness.  In  addition  to  simplifying  the  process  of  proving  correctness,  this 
method  simplifies  the  process  of  providing  accurate  specification  for  a  program.  The  applicability  of  this 
technique  to  procedures  and  recursive  programs  is  demonstrated.

[Gelp79]  Introduction:  Testing,  as  practiced  today,  is  almost  exclusively  concerned  with  the  verification  of 
presently-required  function.  The  purpose  of  this  note  is  to  focus  on  future  function,  i.e.  change,  and  to  propose 
that  the  concerns  of  testing  be  broadened  to  include  maintainability.  What  follows  is  meant  to  clarify  this  proposal
and  to  suggest  some  maintenance  testing  methodologies  in  order  to  stimulate  research.

[Gelp88]  Abbreviated  Introduction:  We  can  trace  the  evolution  of  software  test  engineering  by  examining 
changes  in  the  testing  process  model  and  the  level  of  professionalism  over  the  years.  The  current  definition  of  a 
good  software  testing  practice  involves  some  preventive  methodology. 

[Gerh76a]  Abstract:  Errors,  inconsistencies,  or  confusing  points  are  noted  in  a  variety  of  published  algorithms,
many  of  which  are  being  used  as  examples  in  formulating  or  teaching  principles  of  such  modern  programming
methodologies  as  formal  specification,  systematic  construction,  and  correctness  proving.  Common  properties  of 
these  points  of  contention  are  abstracted.  These  properties  are  then  used  to  pinpoint  possible  causes  of  the 
errors  and  to  formulate  general  guidelines  which  might  help  to  avoid  further  errors.  The  common  characteristic 
of  mathematical  rigor  and  reasoning  in  these  examples  is  noted,  leading  to  some  discussion  about  fallibility  in 
mathematics,  and  its  relationship  to  fallibility  in  these  programming  methodologies.  The  overriding  goal  is  to 
cast  a  more  realistic  perspective  on  the  methodologies,  particularly  with  respect  to  older  methodologies,  such  as 
testing,  and  to  provide  constructive  recommendations  for  their  improvements. 

[Gerh76b]  Abstract:  Backtracking  is  a  well-known  technique  for  solving  combinatorial  problems.  It  is  of  interest 
to  programming  methodologists  because  1)  correctness  of  backtracking  programs  may  be  difficult  to  ascertain 
experimentally  and  2)  efficiency  is  often  of  paramount  importance.  This  paper  applies  a  programming  methodology,
which  we  call  control  structure  abstraction,  to  the  backtracking  technique.  The  value  of  control  structure
abstraction  in  the  context  of  correctness  is  that  proofs  of  general  properties  of  a  class  of  programs  with  similar 
control  structures  are  separated  from  proofs  of  specific  properties  of  individual  programs  of  the  class.  In  the 
context  of  efficiency,  it  provides  sufficient  conditions  for  correctness  of  an  initial  program  which  may  subsequently
be  improved  for  efficiency  while  preserving  correctness.

The  paper  provides  several  abstract  variations  of  backtracking  programs,  along  with  correctness  statements
and  assertions,  and  an  overall  parameterization  of  the  backtracking  technique  to  facilitate  selection  of  the
appropriate  variant  abstraction  for  a  concrete  problem.  The  methodology  is  illustrated  on  the  eight  queens, 
knight’s  tour,  malicious  secretary,  and  good  sequences  problems.  Also  discussed  are  the  amount  of  work 
involved  in  the  control  structure  abstraction  approach  for  this  particular  application  area,  its  relationship  to  the 
data  structure  abstraction  method,  and  its  possible  application  to  other  areas. 
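
A minimal sketch of what separating the backtracking control structure from a concrete problem might look like is given below. It is not the paper's notation or its parameterization: the function names are hypothetical, and the eight queens instance is chosen only because the abstract mentions it.

```python
def backtrack(partial, is_complete, candidates, is_consistent):
    """Generic depth-first backtracking skeleton; yields every complete solution.

    The skeleton knows nothing about the problem. All problem-specific
    knowledge is passed in through the three function parameters.
    """
    if is_complete(partial):
        yield list(partial)
        return
    for c in candidates(partial):
        if is_consistent(partial, c):
            partial.append(c)
            yield from backtrack(partial, is_complete, candidates, is_consistent)
            partial.pop()  # undo the choice before trying the next candidate


def queens(n):
    """Instantiate the skeleton for n queens; partial[i] is the column used in row i."""

    def is_complete(p):
        return len(p) == n

    def candidates(p):
        return range(n)

    def is_consistent(p, col):
        row = len(p)
        # No shared column and no shared diagonal with any earlier queen.
        return all(col != c and abs(col - c) != row - r for r, c in enumerate(p))

    return backtrack([], is_complete, candidates, is_consistent)


solutions = list(queens(8))
print(len(solutions))  # 92, the known number of eight queens solutions
```

Properties of the skeleton (for example, that it enumerates each consistent complete sequence exactly once) can be argued once, while each instance only has to justify its own three functions.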

[Gerh80]  Abstract:  AFFIRM  is  an  experimental  system  for  the  algebraic  specification  and  verification  of
abstract  data  types  and  Pascal-like  programs  using  these  types.  The  heart  of  the  system  is  a  natural  deduction 
theorem  prover  for  the  interactive  proof  of  verification  conditions  and  properties  of  data  types.  Additional 
features  include  tools  for  the  analysis  of  algebraic  specifications,  verification  of  small  programs,  the  specification 
and  partial  proof  of  a  large  file  updating  module,  and  the  proof  of  high  level  properties  of  protocols  and  security 
kernels. 

[Gerh84]  Abstract:  The  goal  of  this  paper  was  to  model  a  specification  language  and  its  analyzer  using  axiomatic
methods  derived  from  those  applied  previously  to  abstract  data  type  and  state  transition  specifications.  The 
models  attempt  to  cover  many  interesting  features  of  PSL/PSA,  a  widely  used  specification  language  and 
analyzer  for  information  systems.  Simple  properties  expected  to  hold  for  actual  PSL/PSA  were  formalized  and 
proved  about  some  models,  with  assumptions  about  undefined  parts.  Both  model  formulation  and  property 
proofs  were  performed  within  the  AFFIRM  Specification  and  Verification  System.  The  results  show  (1)  the 
applicability  of  axiomatic  methods  for  modeling  a  new  kind  of  software  system,  (2)  insights  into  the  PSL/PSA 
class  of  specification  system,  (3)  a  possible  route  for  formal  definition  of  such  analyzers,  and  (4)  additional 
lessons  on  the  art  of  specification,  modeling,  verification,  and  validation.

[Gerh88a]  Abstract:  This  paper  describes  a  suite  of  tools  to  support  analysis  of  properties  of  sequences  associated
with  a  specification,  with  input  or  output  to  a  program,  or  with  simple  behavioral  models  of  a  system  under
design.  The  toolset’s  capabilities  include:  generating  sequences  to  satisfy  combinations  of  conditions,  organizing 
these  condition  combinations  as  tables  of  cases  to  serve  as  test  data,  and  visualizing  the  effects  of  executing  a 
chosen  sequence.  The  technology  base  is  Prolog  extended  with  a  powerful  window  package. 

[Gerh88b]  Position  Statements  Included:

1.  Gaudel,  M-C.,  and  B.  Marre.  “Generation  of  Test  Data  from  Algebraic  Specifications.” 

2.  Wild,  C.  “Generic  Constraint  Logic  Programming  and  Incompleteness  in  the  Analysis  of  Software.” 

[Germ82a]  Abstract:  Most  high  level  languages  with  multiprocessing  do  not  have  built  in  mechanisms  to  detect 
deadlocks  during  program  execution.  We  present  transformation  rules  for  taking  an  original  Ada  program  P  and 
deriving  a  new  program  P',  such  that  P'  has  a  potential  deadlock  if  P  does,  and  P'  signals  whenever  a  deadlock  is
about  to  occur.  In  principle,  the  transformations  can  be  applied  mechanically,  giving  a  practical  tool  for  debugging
deadlocks.  Since  this  method  modifies  the  source  program,  it  can  be  used  with  any  implementation  of  the
language,  without  special  knowledge  of  the  implementation  of  tasking.  The  transformations  that  we  have 
developed  thus  far  are  sufficient  to  handle  most  of  the  complexities  of  Ada  tasking,  including  arbitrary  task 
types,  conditional  entry  calls,  selective  waits,  timed  entry  calls,  and  intertask  exceptions. 

In  the  course  of  this  work,  we  have  developed  some  generally  useful  source  program  transformations, 
such  as  one  to  uniformly  introduce  task  identifiers.  We  have  also  developed  some  interesting  concurrent  algorithms
for  the  deadlock  monitoring.

An  actual  monitor  program  for  detecting  deadlocks  has  been  implemented  in  Ada.  Our  basic  approach 
and  monitoring  algorithms  are  applicable  to  other  languages  with  multiple  processes. 

[Germ84]  Abstract:  We  present  a  deadlock  monitoring  algorithm  for  Ada  tasking  programs  which  is  based  on
transforming  the  source  program.  The  transformations  introduce  a  new  task  called  the  monitor,  which  receives 
information  from  all  other  tasks  about  their  tasking  activities.  The  monitor  detects  deadlocks  consisting  of  circular
entry  calls  as  well  as  some  noncircular  blocking  situations.  The  correctness  of  the  program  transformations  is
formulated  and  proved  using  an  operational  state  graph  model  of  tasking.  The  main  issue  in  the  correctness 
proof  is  to  show  that  the  deadlock  monitor  algorithm  works  correctly  without  having  simultaneous  information 
about  the  state  of  the  program. 

In  the  course  of  this  work,  we  have  developed  some  useful  techniques  for  programming  tasking  applications,
such  as  a  method  for  uniformly  introducing  task  identifiers.

We  argue  that  the  ease  of  finding  and  justifying  program  transformations  is  a  good  test  of  the  generality 
and  uniformity  of  a  programming  language.  The  complexity  of  the  full  Ada  language  makes  it  difficult  to  safely
apply  transformational  methods  to  arbitrary  programs.  We  discuss  several  problems  with  the  current  semantics 
of  Ada’s  tasks. 
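
The circular entry call case can be illustrated with a small wait-for check. This is a simplified sketch and not the monitoring algorithm of the papers above: it assumes each blocked task waits on exactly one other task, it assumes the monitor sees a consistent snapshot (the papers explicitly address the lack of one), and the task names and the `waiting_on` mapping are hypothetical.

```python
def find_deadlock(waiting_on):
    """Look for a circular wait among blocked tasks.

    waiting_on maps each blocked task to the task whose entry it is calling.
    Returns the list of tasks forming a cycle, or None if there is no cycle.
    """
    for start in waiting_on:
        seen = []
        task = start
        # Follow the chain of calls until it leaves the blocked set or repeats.
        while task in waiting_on and task not in seen:
            seen.append(task)
            task = waiting_on[task]
        if task in seen:
            return seen[seen.index(task):]
    return None


# A calls B, B calls C, C calls A: a circular entry call.
print(find_deadlock({"A": "B", "B": "C", "C": "A"}))
# A calls B, B calls C, and C is still running: no deadlock.
print(find_deadlock({"A": "B", "B": "C"}))
```

Noncircular blocking situations, which the abstract also mentions, are outside what this check can detect.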

[Getz83]  Abstract:  A  very  high-level  trace  for  data  structures  is  one  which  displays  a  data  structure  in  the  shape
in  which  the  user  conceptualizes  it,  be  it  a  tree,  an  array,  or  a  graph.  GRAPHTRACE  is  a  system  that  facilitates 
the  very  high-level  graphic  display  of  interrelationships  among  dynamically  allocated  Pascal  records.  It  offers  the 
user  a  wide  range  of  options  to  enable  him  to  “see”  the  data  structures  on  a  graphics  screen  in  a  format  as  close 
as  possible  to  that  in  which  he  visualizes  it,  thereby  providing  a  useful  display  capability  when  the  user’s  conceptual
model  is  a  directed  graph  or  tree.

The  system  is  interactive,  allowing  the  user  to  refine  his  plotting  instructions  stepwise.  He  may  specify  different
combinations  of  pointer  directions,  omit  certain  records,  and  select  a  root  record  if  desired.  As  an  additional
diagnostic  aid,  the  user  may  dump  the  contents  of  specified  records  in  a  format  as  close  as  possible  to  the
original  source  code  in  which  they  were  defined. 

The  system  is  written  in  Pascal.  It  consists  of  a  precompiler  as  well  as  various  coordinate  assignment  and 
plotting  routines  which  provide  for  selective  display  of  the  user’s  data  structures.

The  issue  of  portability  of  the  system  is  discussed  in  detail. 

[Gibs89]  Abstract:  An  experiment  is  designed  to  investigate  the  relationship  between  system  structure  and 
maintainability.  An  old,  ill-structured  system  is  improved  in  two  sequential  stages,  yielding  three  system  versions 
for  the  study.  The  primary  objectives  of  the  research  are  to  determine  how  or  whether  the  differences  in  the  systems
influence  maintenance  performance;  whether  the  differences  are  discernible  to  programmers;  and  whether
the  differences  are  measurable.  Experienced  programmers  perform  a  portfolio  of  maintenance  tasks  on  the  systems.
Results  indicate  that  system  improvements  lead  to  decreased  total  maintenance  time  and  decreased  frequency
of  ripple  effect  errors.  This  suggests  that  improving  old  systems  may  be  worthwhile  and  may  yield  benefits
over  the  remaining  life  of  the  system.  System  differences  are  not  discernible  to  programmers;  apparently  programmers
are  unable  to  separate  the  complexity  of  the  systems  from  the  complexity  of  the  maintenance  tasks.
This  finding  suggests  a  need  for  further  research  on  the  efficacy  of  subjectively  based  software  metrics.  Finally, 
results  indicate  that  a  selected  set  of  automatable,  objective  complexity  metrics  reflected  both  the  improvements 
in  the  system  and  programmer  maintenance  performance.  These  metrics  appear  to  offer  potential  as  project 
management  tools. 

[Gilb79]  Abstract:  David  Gelperin  (SEN  4  2)  tries  to  define  maintainability  in  “TRW”  terms,  using  words  such 
as  changability  and  testability  which  are  defined  [for  the  author]  in  vague  and  narrow  ways.  If  we  are  going  to 
engineer  the  software,  then  [the  author]  suggests  that  our  definitions  should  (1)  encompass  a  broader  scope  of 
the  concept,  (2)  have  operationally  useful  measuring  methods,  and  (3)  relate  more  closely  to  already  accepted 
maintainability  concepts  in  the  systems  engineering  literature. 

[GilkXX]  Abstract:  It  is  shown  that  a  software  design  methodology  based  solely  on  the  identification  of  abstractions
is  insufficient  for  the  engineering  of  complex  software  systems.  Performance  analysis  is  then  introduced  as
an  important  and  necessary  tool  for  choosing  between  alternatives  during  design.  Methods  for  carrying  out  the 
necessary  analysis  are  discussed.  These  methods  are  based  on  a  state  model  of  the  computation  and  a  probabilistic
grammar  based  model  of  the  input.  Finally  a  brief  description  of  our  continuing  research  in  software  design  is
presented. 

[Gill88]  Abstract:  Embedded  real  time  control  systems  typically  require  the  use  of  special  debugging  environments,
which  consist  of  a  special  debugging  processor  that  hosts  the  debugging  software  and  that  monitors  the
execution  of  the  separate  target  system  via  a  special  hardware  interface.  Our  focus  is  on  extending  the  set  of  base 
debugging  features  typically  found  in  such  an  environment  to  provide  better  support  for  real  time  task  debugging 
and  to  provide  a  more  visual  graphic  display  of  program  behavior.  We  have  developed  a  system  that  provides 
such  facilities,  in  the  form  of  a  task  condition  specification  and  checking  system  and  a  multi-window  graphics 
display.  We  have  implemented  a  prototype  of  these  novel  debugging  features  that  demonstrates  how  they  work 
and  how  they  assist  in  the  debugging  process.  We  illustrate  the  features  of  our  system  here  by  providing  a  “tour” 
through  an  example  debugging  session.  We  also  comment  on  various  constraints  that  directed  the  prototype 
development  process. 

[Ginz65]  Abstract:  Procedures  for  program  testing  associated  with  implementation  of  a  large  complex  real-time
system  are  discussed  step  by  step.  The  discussion  includes  testing  both  in  a  simulated  environment  and  in  real 
time.  Final  testing  and  monitoring  of  the  system  performance  are  also  briefly  considered. 

[Girg85]  Abstract:  The  idea  of  weak  mutation  testing  is  to  construct  test  data  which  would  force  program  components
such  as  expressions  and  variable  references  to  produce  a  wrong  ‘result’  if  they  were  to  contain  certain
types  of  error,  for  example,  off-by-a-constant  or  wrong-variable.  The  idea  of  data  flow  driven  testing  is  to  construct
test  data  which  forces  the  execution  of  different  interactions  between  variable  definitions  and  references  in
a  program. 

This  paper  describes  a  tool  for  FORTRAN  77  programs  which  has  been  developed  to  help  the  user  apply 
the  weak  mutation  and  data  flow  testing  techniques.  The  tool  instruments  a  given  source  program  and  collects  a
program  execution  history.  It  is  then  able  to  report  on  the  completeness  of  the  test  data  with  respect  to  weak 
mutation  and  a  family  of  data  flow  path  selection  criteria.  Some  preliminary  experiments  with  use  of  the  tool  are 
described. 
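
How an execution history might be checked against a data flow criterion can be sketched as below. This is an assumption-laden illustration and not the tool described above: the trace format, the function names, and the five-statement example program are hypothetical, and Python replaces FORTRAN 77 for brevity.

```python
def exercised_du_pairs(trace):
    """Collect the definition-use pairs exercised by one execution history.

    trace is a list of (statement, defined_vars, used_vars) in execution order.
    Returns a set of (variable, defining_statement, using_statement) triples.
    """
    last_def = {}
    pairs = set()
    for stmt, defs, uses in trace:
        for v in uses:
            if v in last_def:
                pairs.add((v, last_def[v], stmt))
        for v in defs:
            last_def[v] = stmt  # this definition kills any earlier one
    return pairs


def coverage(required, traces):
    """Fraction of the required definition-use pairs covered by the given traces."""
    covered = set()
    for t in traces:
        covered |= exercised_du_pairs(t)
    return len(required & covered) / len(required)


# Hypothetical program:  1: read x   2: if x > 0   3: y = x   4: y = -x   5: print y
required = {("x", 1, 2), ("x", 1, 3), ("x", 1, 4), ("y", 3, 5), ("y", 4, 5)}
trace_positive = [(1, {"x"}, set()), (2, set(), {"x"}), (3, {"y"}, {"x"}), (5, set(), {"y"})]
trace_negative = [(1, {"x"}, set()), (2, set(), {"x"}), (4, {"y"}, {"x"}), (5, set(), {"y"})]

print(coverage(required, [trace_positive]))                  # 0.6
print(coverage(required, [trace_positive, trace_negative]))  # 1.0
```

A single positive input covers three of the five pairs; adding a negative input covers the remaining two.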

[Girg86a]  Abstract:  A  system  called  FORTEST  has  been  developed  which  helps  a  user  apply  weak  mutation 
testing,  data  flow  testing  and  control  flow  testing  for  FORTRAN  77  programs.  This  paper  concentrates  on  experiments
which  have  been  performed  to  compare  the  ability  of  test  coverage  criteria,  monitored  by  the  FORTEST
system,  to  aid  discovery  of  a  large  number  of  errors  seeded  into  sample  programs.  Although  overall  the  control 
flow  strategy  was  the  most  effective  method  in  discovering  errors,  it  does  not  provide  such  specific  guidance  in 
the  construction  of  test  data  as  the  other  strategies.  What  is  more,  some  errors  were  exposed  only  by  the  data 
flow  method.  Hence  it  is  argued  that  the  diverse  strategies  are  best  seen  as  complementary  rather  than  competing 
methods. 

[Glas79]  Abbreviated  Preface:  Software  reliability  has  been  a  neglected  field.  Some  emphasis  has  been  placed  in 
recent  years  on  management  to  achieve  reliability,  and  the  measurement  of  reliability,  but  technology  to  achieve
reliability  has  progressed  little  in  the  same  time  period. 

That  situation  is  changing.  Software  implementors  and  purchasers  of  software,  particularly  the  Department
of  Defense,  are  beginning  to  insist  on  reliability  as  a  requirement  of  delivered  software.  This  guidebook  is  a
survey  of  technological  and  management  techniques,  written  as  a  menu.  Each  item  in  the  menu  is  evaluated, 
examples  of  use  are  given,  and  references  are  provided  for  further  study.  Recommendations  for  achievement  of 
software  reliability  are  also  provided. 

The  guidebook  is  intended  to  be  useful  for  all  application  areas  and  sizes  of  software  projects;  special 
emphasis  is  placed  on  the  problems  of  large  projects,  such  as  those  of  military/space  applications  and  massive 
interrelated  data  bases. 

The  reader  is  expected  to  be  a  software  manager  or  technologist  or  student  who  has  a  basic  understanding 
of  what  software  is,  but  whose  knowledge  of  reliability  concepts  is  either  rudimentary  or  has  not  been  updated  to 
include  recent  developments. 

[Glas80]  Abbreviated  Introduction:  The  literature  rarely  provides  value  judgments  about  which  methodologies 
might  prove  most  effective  in  identifying  and  correcting  the  kinds  of  errors  which  practicing  professional  programmers
commonly  make.  An  exception  is  the  work  of  Howden.  Here,  a  set  of  known  software  errors  is
analyzed  against  a  set  of  methodologies  to  determine  which  methodologies  might  have  detected  which  errors. 
Out  of  this  analysis  comes  a  set  of  “enlightened”  advocacies  of  particular  techniques  which  could  have  found  a 
large  number  of  the  known  errors. 

This  paper  reports  on  an  extension  of  Howden’s  work.  Whereas  Howden  limited  his  review  to  highly 
mathematical  scientific  library  programs,  this  review  examines  real-time  software  with  a  considerably  more 
varied  set  of  logical  requirements.  Whereas  Howden  limited  his  review  to  an  intense  understanding  of  a  small
number  of  errors,  this  review  took  a  less  time-consuming  look  at  a  larger  number  of  errors.  Whereas  Howden 
considered  a  small  set  of  reliability  methodologies,  this  review  considered  a  somewhat  larger  sample  of  the 
methodologies  defined  in  [an  earlier  paper  by  the  author]. 

The  broad  conclusions  of  this  paper  are  similar  to  those  of  Howden.  A  set  of  reliability  methodologies  is 
needed;  no  one  or  two  techniques  are  sufficient  to  come  close  to  guaranteeing  software  reliability.  That  set  should 
include  some  sort  of  functional  testing,  some  sort  of  structured  testing,  and  some  sort  of  static  analysis.  However,
the  specific  methodologies  recommended  as  a  result  of  this  study  differ  somewhat  from  Howden’s
recommendations.

[Glas81]  Abstract:  Persistent  software  errors  -  those  which  are  not  discovered  until  late  in  development,  such  as
when  the  software  becomes  operational  -  are  by  far  the  most  expensive  kind  of  error.  Via  analysis  of  software 
problem  reports,  it  is  discovered  the  predominant  number  of  persistent  errors  in  large-scale  software  efforts  are 
errors  of  omitted  logic,  that  is,  the  code  is  not  as  complex  as  required  by  the  problem  to  be  solved.  Peer  design 
and  code  review,  desk  checking,  and  ultra-rigorous  testing  may  be  the  most  helpful  of  the  currently  available
technologies  in  attacking  this  problem.  New  and  better  methodologies  are  needed. 

[Glig87]  Abstract:  A  new  security  testing  method  is  proposed  that  combines  the  advantages  of  both  traditional
“black  box”  (monolithic  functional)  testing  and  “white  box”  (functional-synthesis-based)  testing.  The  new 
method  allows  significant  coverage  both  for  security  model-based  tests  and  for  individual  kernel-call  tests.  It 
eliminates  redundant  kernel  test  cases  1)  by  using  a  variant  of  control  synthesis  graphs,  2)  by  analyzing  dependencies
between  descriptive  kernel-call  specifications,  and  3)  by  exploiting  access  check  separability.  A  higher
degree  of  test  assurance  is  achieved  than  that  of  other  security  testing  methods  because  the  new  method  helps 
eliminate  cyclic  dependencies  among  test  programs  for  different  kernel  calls.  The  application  of  this  method  to 
the  testing  of  the  Secure  Xenix  kernel  is  illustrated. 

[Goel79]  Abstract:  This  paper  presents  a  stochastic  model  for  the  software  failure  phenomenon  based  on  a 
nonhomogeneous  Poisson  process  (NHPP).  The  failure  process  is  analyzed  to  develop  a  suitable  mean-value
function  for  the  NHPP;  expressions  are  given  for  several  performance  measures.  Actual  software  failure  data  are 
analyzed  and  compared  with  a  previous  analysis. 
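
The performance measures that follow from an NHPP mean-value function can be sketched numerically. The abstract does not state the function, so the exponential form m(t) = a(1 - exp(-bt)) used here is an assumption (it is the form commonly associated with this model), and the parameter values a = 100 and b = 0.05 are invented for illustration.

```python
import math


def mean_value(t, a, b):
    """Expected cumulative number of failures by time t (assumed exponential form)."""
    return a * (1.0 - math.exp(-b * t))


def intensity(t, a, b):
    """Failure intensity at time t: the derivative of mean_value with respect to t."""
    return a * b * math.exp(-b * t)


def expected_remaining(t, a, b):
    """Expected number of failures still to be observed after time t."""
    return a - mean_value(t, a, b)


def reliability(x, t, a, b):
    """Probability of no failure in (t, t + x], a general NHPP property."""
    return math.exp(-(mean_value(t + x, a, b) - mean_value(t, a, b)))


a, b = 100.0, 0.05  # hypothetical parameters, not fitted to any data set
for t in (0, 10, 50, 100):
    print(t, round(mean_value(t, a, b), 2), round(reliability(1.0, t, a, b), 4))
```

Under this form, a is the expected total number of failures and b controls how quickly the detection rate decays; both would normally be estimated from observed failure data.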

[Goel80c]  Abstract:  In  March  1978,  Schick  and  Wolverton  published  a  paper  [Schi78]  in  the  IEEE  Transactions 
on  Software  Engineering.  Moranda  criticized  several  aspects  of  this  paper.  His  critique  was  reviewed  by  Littlewood
and  rebutted  by  Schick  and  Wolverton.  The  purpose  of  this  note  is  to  summarize  and  comment  on  the  main
points  raised  in  these  discussions. 

[Goel81]  Abbreviated  Introduction:  During  the  last  decade,  numerous  studies  have  been  undertaken  to  quantify 
the  failure  process  of  large  scale  software  systems.  An  important  objective  of  these  studies  is  to  predict  software 
performance  and  use  the  information  for  decision  making.  An  important  decision  of  practical  concern  is  the 
determination  of  the  amount  of  time  that  should  be  spent  in  testing.  This  decision  of  course  will  depend  on  the 
model  used  for  describing  the  failure  phenomenon  and  the  criterion  used  for  determining  system  readiness. 

In  this  paper  we  present  a  cost  model  based  on  the  time  dependent  fault  detection  rate  model  of  Goel  and 
Okumoto  and  describe  a  policy  that  yields  the  optimal  value  of  test  time  T. 

A  brief  overview  of  the  failure  model  is  given  in  Section  2.  The  cost  model  and  the  optimal  policies  are 
described  in  Section  3.  The  results  are  illustrated  via  numerical  examples  in  Section  4.

[Goel83]  Abstract:  The  purpose  of  this  guidebook  is  to  provide  state-of-the-art  information  about  the  selection 
and  use  of  existing  software  reliability  models.  Towards  this  objective,  we  have  presented  a  brief  summary  of  the 
available  models  backed  by  a  detailed  discussion  of  most  of  the  models  in  the  appendixes.  One  of  the  difficulties 
in  choosing  a  model  is  to  find  a  match  between  the  testing  environment  and  a  class  of  models.  To  help  a  user  in 
this  process,  we  have  presented  a  detailed  discussion  of  most  of  the  assumptions  that  characterize  the  various 
software  reliability  models.  The  process  of  developing  a  model  has  been  explained  in  detail  and  illustrated  via 
numerical  examples. 

[Goel85]  Abstract:  A  number  of  analytical  models  have  been  proposed  during  the  past  15  years  for  assessing  the 
reliability  of  a  software  system.  In  this  paper  we  present  an  overview  of  the  key  modeling  approaches,  provide  a 
critical  analysis  of  the  underlying  assumptions,  and  assess  the  limitations  and  applicability  of  these  models  during 
the  software  development  cycle.  We  also  propose  a  step-by-step  procedure  for  fitting  a  model  and  illustrate  it  via 
an  analysis  of  failure  data  from  a  medium-sized  real-time  command  and  control  software  system. 

[Goel88]  Abstract:  This  report  presents  the  results  of  an  experiment  investigating  the  effect  of  Fortran  and  Ada 
languages  on  program  reliability.  The  experimental  design  employed  was  a  2^2  full  factorial  design,  i.e.,  a  design
in  two  variables,  each  of  two  levels.  The  problem  used  in  the  experiment  was  the  Launch  Interceptor  Program 
(LIP),  a  simple  but  realistic  anti-missiles  system.  Reliability  comparisons  between  Ada  and  Fortran  programs 
were  based  on  the  total  number  of  errors  as  well  as  on  errors  found  during  various  testing  phases.  Some 
comparisons were also based on error density, the number of errors per 100 non-comment lines of code. It was
found that on the average, the Ada programs had about 70 percent fewer errors. Similar differences were found
for  data  based  on  error  causes  and  error  types. 
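
The error-density measure mentioned in this abstract can be made concrete with a small sketch. The Python below is illustrative only: the function names and all counts are hypothetical and are not data from [Goel88].

```python
def error_density(errors: int, non_comment_lines: int) -> float:
    """Return errors per 100 non-comment lines of code."""
    if non_comment_lines <= 0:
        raise ValueError("non_comment_lines must be positive")
    return 100.0 * errors / non_comment_lines


def percent_fewer(baseline: float, other: float) -> float:
    """Return how much smaller `other` is than `baseline`, in percent."""
    return 100.0 * (baseline - other) / baseline


# Hypothetical counts, chosen only to exercise the arithmetic.
density_a = error_density(errors=12, non_comment_lines=800)  # 1.5 per 100 lines
density_b = error_density(errors=3, non_comment_lines=600)   # 0.5 per 100 lines
reduction = percent_fewer(density_a, density_b)
```

Normalizing by non-comment lines lets programs of different sizes be compared on the same scale, which a raw error count does not allow.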

[Gogu80] Abbreviated Introduction: This note describes a certain general approach to system design and verification
based on the use of a high level executable specification language. We begin with a discussion of the nature
of specification languages, and give a rather long list of desirable features. We then discuss the two SRI design
and specification projects, SPECIAL with HDM and the combination of CLEAR, OBJ and CAT, in the light of
these requirements. We conclude by indicating some possible new directions.

[Good70]  Abstract:  A  definition  is  given  of  computer  interval  arithmetic  suitable  for  implementation  on  a  digi¬ 
tal  computer.  Some  computational  properties  and  simplifications  are  derived.  An  ALGOL  code  segment  is 
proved  to  be  a  correct  implementation  of  the  definition  on  a  specified  machine  environment. 
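
As a rough illustration of what a machine interval arithmetic looks like, here is a minimal Python sketch. It is not the definition or the ALGOL code from [Good70]; the `Interval` class and the use of `math.nextafter` (Python 3.9 or later) to widen each result by one floating-point step are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    @staticmethod
    def _outward(lo: float, hi: float) -> "Interval":
        # Widen by one floating-point step on each side so that rounding
        # in the endpoint computation cannot exclude the exact result.
        return Interval(math.nextafter(lo, -math.inf),
                        math.nextafter(hi, math.inf))

    def __add__(self, other: "Interval") -> "Interval":
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        return Interval._outward(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(products), max(products))

    def contains(self, x: float) -> bool:
        return self.lo <= x <= self.hi


a = Interval(1.0, 2.0)
b = Interval(-3.0, 0.5)
```

The property such an implementation must preserve is containment: for any x in a and y in b, the exact value of x + y, x - y, or x * y lies inside the computed interval.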

[Good75a]  Abstract:  This  paper  examines  the  theoretical  and  practical  role  of  testing  in  software  development. 
We  prove  a  fundamental  theorem  showing  that  properly  structured  tests  are  capable  of  demonstrating  the 
absence  of  errors  in  a  program.  The  theorem’s  proof  hinges  on  our  definition  of  test  reliability  and  validity,  but 
its  practical  utility  hinges  on  being  able  to  show  when  a  test  is  actually  reliable.  We  explain  what  makes  tests 
unreliable  (for  example,  we  show  by  example  why  testing  all  program  statements,  predicates,  or  paths  is  not  usu¬ 
ally  sufficient  to  insure  test  reliability),  and  we  outline  a  possible  approach  to  developing  reliable  tests.  We  also 
show  how  the  analysis  required  to  define  reliable  tests  can  help  in  checking  a  program’s  design  and  specifications 
as  well  as  in  preventing  and  detecting  implementation  errors. 

[Good75c]  Abstract:  This  paper  is  an  initial  progress  report  on  the  development  of  an  interactive  system  for  ver¬ 
ifying  that  computer  programs  meet  given  formal  specifications.  The  system  is  based  on  the  conventional  induc¬ 
tive  assertion  method:  given  a  program  and  its  specifications,  the  object  is  to  generate  the  verification  conditions, 
simplify  them,  and  prove  what  remains.  The  important  feature  of  the  system  is  that  the  human  user  has  the 
opportunity  and  obligation  to  help  actively  in  the  simplifying  and  proving.  The  user,  for  example,  is  the  primary 
source  of  problem  domain  facts  and  properties  needed  in  the  proofs.  A  general  description  is  given  of  the  overall 
design  philosophy,  structure,  and  functional  components  of  the  system,  and  a  simple  sorting  program  is  used  to 
illustrate  both  the  behavior  of  major  system  components  and  the  type  of  user  interaction  the  system  provides. 

[Good75d]  Abstract:  This  paper  defines  exception  conditions,  discusses  the  requirements  exception  handling 
language  features  must  satisfy,  and  proposes  some  new  language  features  for  dealing  with  exceptions  in  an  ord¬ 
erly  and  reliable  way.  The  proposed  language  features  serve  to  highlight  exception  handling  issues  by  showing 
how  deficiencies  in  current  approaches  can  be  remedied. 

[Good75e]  Abstract:  Techniques  are  presented  for  the  design  of  computer  programs  that  are  proved  to  meet 
stated  specifications.  The  design  strategy  is  the  simultaneous  step-wise  refinement  of  both  the  program  and  its 
proof  so  that  at  each  step  the  program  constructed  so  far  is  proved.  At  each  step,  the  specifications  for  a  single 
program  unit  are  given,  the  unit  is  designed,  and  then  proved,  by  automatically  supportable  methods,  before 
going  on  to  successive  steps.  The  proof  i)  shows  that  the  program  unit  meets  its  specifications,  ii)  exhibits  any 
assumptions  the  unit  makes  about  the  problem  domain,  and  iii)  defines  the  specifications  for  units  to  be  designed 
in  later  steps.  The  design  process  is  based  on  the  refinement  of  operational  and  data  abstractions  in  both  the  pro¬ 
gram  and  its  specifications.  These  abstractions  are  what  allow  the  proof  at  each  step  to  be  supported  by 
automatic,  or  interactive,  program  proving  systems.  The  abstractions  also  keep  the  proofs  of  the  individual  units 
at  an  appropriate  level  of  abstraction  and  also  largely  independent,  thus  significantly  reducing  the  size  of  the 
complete  proof  of  the  entire  program.  These  techniques  of  provable  programming  are  illustrated  by  two  exam¬ 
ples. 

[Good79a]  Abbreviated  Introduction:  Testing  is  the  principal  method  of  deciding  whether  a  program  is  ready  for 
operational use. In this paper, [the author] will examine various testing approaches. The purpose behind this
examination  is 

1.  to  summarize  what  is  known  today  about  testing  principles  and  practices,  alerting  software  developers  to 
shortcomings  and  advantages  of  some  methods  under  development; 

2.  to  stimulate  productive  research  on  testing  methodology  by  identifying  areas  where  further  work  is  needed; 
and 

3.  to  provide  a  framework  within  which  testing  techniques  can  be  identified,  evaluated,  and  improved. 

[Gord76]  Abstract:  Recent  discoveries  in  the  area  of  Algorithm  Structure  or  Software  Physics  have  produced  a 
number  of  hypotheses.  One  of  these  relates  the  number  of  elementary  mental  discriminations  required  to  imple¬ 
ment  an  algorithm  to  measurable  properties  of  that  algorithm,  and  the  results  of  one  set  of  experiments  confirm¬ 
ing  this  relationship  have  been  published.  That  publication,  while  significant,  made  no  claim  to  finality,  suggest¬ 
ing  instead  that  further  experiments  were  warranted.  This  paper  will  present  the  results  of  a  second  set  of  experi¬ 
ments,  having  the  advantages  of  being  conducted  in  a  single  implementation  language,  Fortran,  from  problem 
specifications  readily  available  in  computer  textbooks. 

The  first  section  of  this  paper  presents  the  timing  hypothesis,  and  the  elementary  equations  upon  which  it 
rests.  The  second  section  presents  the  details  of  the  experiment  and  the  results  which  were  obtained,  and  the 
third  section  contains  an  analysis  of  the  data. 

[Gord79a]  Abstract:  The  sharply  rising  cost  incurred  during  the  production  of  quality  software  has  brought  with 
it  the  need  for  the  development  of  new  techniques  of  software  measurement.  In  particular,  the  ability  to  objec¬ 
tively  assess  the  clarity  of  a  program  is  essential  in  order  to  rationally  develop  useful  engineering  guidelines  for 
efficient  software  production  and  language  development. 

A  functional  relation  between  the  clarity  of  a  program  and  the  number  and  frequency  of  operators  and 
operands  which  occur  in  the  program  is  presented.  This  measure  of  program  clarity  provides  an  estimate  of  the 
amount  of  mental  effort  required  to  understand  the  program,  assuming  that  the  reader  is  fluent  in  the  program¬ 
ming  language  employed. 

This  measure  is  tested  by  applying  it  to  several  published  examples  which  demonstrate  improvements  in 
program  clarity.  The  objective  assessment  which  is  provided  using  this  measure  is  found  to  agree  with  the  experi¬ 
mental  data  gathered. 

[Gord79b]  Abstract:  Several  measures  of  program  clarity  have  been  proposed  which  attempt  to  assess  the  clarity 
of  a  program  as  a  function  of  easily  measured  properties  of  the  code.  Such  measures  include  the  number  of  vari¬ 
ables  or  statements,  or  the  density  of  go  to’s. 

The  measure  of  program  clarity,  developed  in  the  field  of  software  science,  equates  the  amount  of  mental 
effort  required  to  understand  a  program  with  the  ratio  of  program  volume  to  implementation  level.  To  be  effec¬ 
tive,  a  measure  such  as  this  should  reflect  the  improvement  in  clarity  which  occurs  when  program  transforma¬ 
tions  which  make  software  easier  to  understand  are  applied. 

The  removal  of  each  of  six  impurity  classes  from  poorly  written  programs  is  studied.  For  a  wide  class  of 
programs,  purification  reduces  the  amount  of  effort  required  for  comprehension  as  predicted  by  the  measure. 
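
The ratio named in this abstract (program volume divided by implementation level) can be sketched in a few lines of Python. The abstract gives only the ratio; the formulas for volume and for the estimated level used below are the usual software-science definitions and are assumed here, and the operator and operand counts are hypothetical.

```python
import math


def halstead_effort(n1: int, n2: int, N1: int, N2: int) -> float:
    """Estimate comprehension effort E = V / L.

    n1, n2: number of distinct operators and distinct operands.
    N1, N2: total occurrences of operators and of operands.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    level = (2.0 / n1) * (n2 / N2)            # estimated implementation level
    return volume / level


# Hypothetical counts: the second program uses the same vocabulary but
# repeats operators and operands more often, as an impure program might.
purified = halstead_effort(n1=2, n2=2, N1=4, N2=4)
impure = halstead_effort(n1=2, n2=2, N1=6, N2=6)
```

With these counts the purified version yields the smaller effort estimate, which is the direction of change the abstract reports for purification.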

[Gord85a]  Abstract:  The  purpose  of  this  technical  note  is  to  formulate  a  framework  for  the  evaluation  of 
software  metrics  and  to  present  preliminary  results  toward  that  formulation.  This  framework  is  in  support  of 
DOD’s Software Technology for Adaptable Reliable Systems (STARS) program whose principal software-related
goals are:

1.  Improve  productivity  (up  to  tenfold). 

2.  Improve  quality  (maintainability,  enhancability,  correctness,  efficiency)  and  reliability. 

3.  Increase  portability. 

4.  Promote  development  and  application  of  reusable  software. 

5.  Reduce  time  and  cost  to  develop  defense  software. 

The  STARS  program  will  achieve  these  goals  by  conducting  research  and  development  on  integrated 
Software Engineering Environments (SEEs) and then utilize them. An integrated SEE will provide automated
tools  for  carrying  out  the  task  of  various  software  life  cycle  phases  such  as  requirements  analysis,  design,  coding, 
module  testing,  integration  testing,  and  maintenance. 

MITRE’s role is that of system research and analysis in determining and developing techniques to measure
and evaluate the progress made in achieving the aforementioned goals. The remainder of this technical note
presents the initial results of MITRE’s review of software metrics and suggestions for future activities.

[Gord86]  Abstract:  In  a  distributed  environment  events  occur  concurrently  on  different  processors.  The  order 
in  which  events  occur  cannot  be  easily  determined;  a  program  that  works  correctly  one  time  may  fail  subse¬ 
quently  if  the  timing  between  processors  changes.  For  this  research,  we  have  investigated  distributed  program¬ 
ming  bugs  that  depend  on  the  relative  order  between  events.  We  describe  a  tool  (called  TAP)  to  aid  the  program¬ 
mer  in  discovering  the  causes  of  timing  errors  in  running  programs,  describe  experiments  using  TAP,  and  report 
the  impact  TAP’s  history-keeping  mechanism  has  on  the  running  time  of  various  distributed  programs.  We  also 
show  that  TAP  is  useful  in  finding  other  types  of  distributed  program  bugs. 

[Gord88]  Abstract:  In  a  distributed  environment,  events  occur  concurrently  on  different  processors.  The  order 
in  which  events  occur  cannot  be  easily  determined;  a  program  that  works  correctly  one  time  may  fail  subse¬ 
quently  if  the  timing  between  processors  changes.  For  this  research,  we  have  investigated  distributed  program 
bugs  that  depend  on  the  relative  order  between  events.  We  describe  a  tool  (called  TAP)  to  aid  the  programmer  in 
discovering  the  causes  of  timing  errors  in  running  programs.  TAP,  a  tool  similar  to  a  postmortem  debugger,  uses 
the  history  of  interprocess  communication  to  construct  a  timing  graph,  a  directed  graph  where  an  edge  joins 
node  x  to  node  y  if  event  x  directly  precedes  event  y  in  time.  The  programmer  can  then  use  TAP  to  look  at  the 
graph  to  find  the  events  that  occurred  in  an  unacceptable  order. 

Because  of  the  nondeterministic  nature  of  distributed  programs,  we  feel  a  history-keeping  mechanism 
must  always  be  active  so  that  bugs  can  be  dealt  with  as  they  occur.  Our  goal  is  to  collect  enough  information  at 
run  time  to  construct  the  timing  graph  if  needed.  Since  it  is  always  active,  this  mechanism  must  be  efficient. 

We also describe experiments run using TAP and report the impact that TAP’s history-keeping mechanism
has  on  the  running  time  of  various  distributed  programs. 
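
The timing graph described in this abstract can be sketched as follows. This is not TAP's implementation: the event names, the per-process history format, and the rule that a send directly precedes its matching receive are assumptions made for the illustration.

```python
from collections import defaultdict


def build_timing_graph(process_histories, messages):
    """Build a directed graph with an edge x -> y when x directly precedes y.

    process_histories: dict mapping a process name to its ordered event list.
    messages: iterable of (send_event, receive_event) pairs.
    """
    graph = defaultdict(set)
    for events in process_histories.values():
        # Consecutive events on one processor are ordered.
        for earlier, later in zip(events, events[1:]):
            graph[earlier].add(later)
    for send, receive in messages:
        # A message is sent before it is received.
        graph[send].add(receive)
    return graph


def precedes(graph, x, y):
    """Return True if a path leads from event x to event y."""
    stack, seen = [x], set()
    while stack:
        node = stack.pop()
        for successor in graph.get(node, ()):
            if successor == y:
                return True
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return False


# Hypothetical history of three processes.
histories = {
    "P1": ["p1.send_req", "p1.recv_ack"],
    "P2": ["p2.recv_req", "p2.send_ack"],
    "P3": ["p3.log"],
}
msgs = [("p1.send_req", "p2.recv_req"), ("p2.send_ack", "p1.recv_ack")]
timing_graph = build_timing_graph(histories, msgs)
```

Two events with no path between them in either direction, such as `p3.log` and `p1.send_req` here, are unordered, and these are the pairs whose relative timing may differ from one run to the next.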

[Gorl87]  Abstract:  Mockingbird  is  a  testing  methodology  founded  on  a  formal  specification  of  the  test  space. 
The  specifications  are  executable  and  bidirectional.  When  run  in  one  direction  they  act  as  generators,  producing 
tests whose properties conform to the specification. When run in the opposite direction they act as acceptors,
validating tests against the specification. The specification language is a combination of context-free grammars
and  constraint  systems.  The  semantics  of  the  specification  are  based  on  Constraint  Logic  Programming.  This 
paper  describes  the  philosophy,  design  and  implementation  of  Mockingbird  and  its  use  in  testing  a  large,  com¬ 
plex  system. 
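
To illustrate the generator/acceptor idea in this abstract, the Python sketch below pairs a tiny grammar with one constraint. It is not Mockingbird and does not use Constraint Logic Programming: the two directions are written as two separate functions over the same hypothetical specification (a test is a sequence of `push` and `pop` commands that never pops an empty stack).

```python
import random

# Grammar (hypothetical): test ::= command* ; command ::= "push" | "pop"
COMMANDS = ("push", "pop")


def accepts(test) -> bool:
    """Acceptor: validate a test against the grammar and the constraint."""
    depth = 0
    for command in test:
        if command not in COMMANDS:
            return False          # not derivable from the grammar
        depth += 1 if command == "push" else -1
        if depth < 0:
            return False          # constraint violated: pop on empty stack
    return True


def generate(length: int, rng: random.Random):
    """Generator: produce a test that conforms to the same specification."""
    test, depth = [], 0
    for _ in range(length):
        # Only offer commands that keep the constraint satisfiable.
        choices = ["push"] if depth == 0 else list(COMMANDS)
        command = rng.choice(choices)
        depth += 1 if command == "push" else -1
        test.append(command)
    return test


sample = generate(10, random.Random(0))
```

In a truly bidirectional specification language the generator and acceptor would be derived from one definition, so they could not drift apart as two hand-written functions can.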

[Gors80]  Abbreviated  Abstract:  In  this  investigation  of  the  factors  that  contribute  to  program  simplicity  and 
understandability  (and  thus  modifiability),  we  are  considering  a  program  to  be  structured  if  the  control  structure 
can  be  expressed  via  a  Nassi-Shneiderman  Diagram  which  is  equivalent  to  being  a  Structured  Program  in  the 
scheme  of  Linger,  Mills  and  Witt.  We  assert  that  it  is  insufficient  to  merely  identify  these  factors.  We  further 
assert  that  it  also  is  insufficient  to  present  subjective  measures  of  these  factors  in  that  subjective  measurements 
reduce  to  opinions  and  are  not  measurements  at  all.  If  we  are  satisfied  with  subjective  measurements  of  com¬ 
plexity,  then  we  are  satisfied  with  opinions  of  complexity  and  are  also  complacently  satisfied  with  programming  as 
an  “art.”  A  science  of  programming  necessarily  implies  the  ability  to  objectively  measure  on  a  quantitative  scale 
the  important  characteristics  of  a  program  or  of  a  system  of  programs. 

A  system  of  programs  considered  at  the  source  statement  level  possesses  a  structure  directly  derived  from 
the top-down analysis and module definition. Although this macro-structure (the inter-module structure) may be
relatively  simple  and  hierarchical,  it  may  be  complex.  What  does  the  word  “complex”  mean?  Can  we  objec¬ 
tively  measure  the  simplicity/complexity  factor  in  a  quantitative  sense  at  the  inter-module  level? 

[Goul74] Abstract: This experiment represents a new approach to the study of the psychology of programming,
and  demonstrates  the  feasibility  of  studying  an  isolated  part  of  the  programming  process  in  the  laboratory.  Thirty 
experienced  FORTRAN  programmers  debugged  12  one-page  FORTRAN  listings,  each  of  which  was  syntacti¬ 
cally  correct  but  contained  one  non-syntactic  error  (bug).  Three  classes  of  bugs  (Array  bugs,  Iteration  bugs,  and 
bugs  in  Assignment  Statements)  in  each  of  four  different  programs  were  debugged.  The  programmers  were 
divided  into  five  groups,  based  upon  the  information  or  debugging  “aids,”  given  them.  Key  results  were  that 
debug  times  were  short  (median  6  min.).  The  aids  groups  did  not  debug  faster  than  the  control  group;  program¬ 
mers  adopted  their  debugging  strategies  based  upon  the  information  available  to  them.  The  results  suggested  that 
programmers  often  identify  the  intended  state  of  a  program  before  they  find  the  bug.  Assignment  bugs  were  more 
difficult  to  find  than  Array  and  Iteration  bugs,  probably  because  the  latter  could  be  detected  from  a  high-level 
understanding  of  the  programming  language  itself.  Debugging  was  at  least  twice  as  efficient  the  second  time  pro¬ 
grammers  debugged  a  program  (though  with  a  different  bug  in  it).  A  simple  hierarchical  description  of  debugging 
was  suggested,  and  some  possible  “principles”  of  debugging  were  identified. 

[Gour81] Abbreviated Introduction: It is the purpose of this thesis to describe a new theoretical framework for
testing  that  [provides  a  more  useful  criterion  for  judging  testing  than  whether  or  not  it  is  capable  of  verification, 
provides  a  way  of  comparing  methods  of  testing  with  each  other,  addresses  program  reliability,  and  generalizes 
previous  work  on  the  subject].  Chapter  II  develops  the  framework,  including  definitions  that  relate  verification 
and testing and the relative powers of methods of testing. Chapter III shows how previous work, both theoretical
and applied, fits into this framework. Important theoretical works are shown to be special cases of the new
framework  and  the  framework  is  shown  to  satisfy  some  previously  articulated  needs.  Many  of  the  common  con¬ 
ceptions  about  the  power  of  various  practical  testing  methods  are  confirmed,  but  one,  the  implication  of  muta¬ 
tion  testing’s  “competent  programmer  hypothesis,”  is  shown  to  be  false.  One  of  the  conceptions  confirmed  by 
the  framework  is  the  importance  of  finding  greater  use  for  specifications  of  programs  in  the  generation  of  test 
data.  Prior  work  on  this  subject  is  outlined  and  then  Chapter  IV  presents  in  detail  a  new  method  for  generating 
test  data  from  formal  specifications. 

[Gour83]  Abstract:  Testing  has  long  been  in  need  of  mathematical  underpinnings  to  explain  its  value  as  well  as 
its  limitations.  This  paper  develops  and  applies  a  mathematical  framework  that  1)  unifies  previous  work  on  the 
subject,  2)  provides  a  mechanism  for  comparing  the  power  of  methods  of  testing  programs  based  on  the  degree 
to  which  the  methods  approximate  program  verification,  and  3)  provides  a  reasonable  and  useful  interpretation 
of  the  notion  that  successful  tests  increase  one’s  confidence  in  the  program’s  correctness. 

Applications  of  the  framework  include  confirmation  of  a  number  of  common  assumptions  about  practical 
testing  methods.  Among  the  assumptions  confirmed  is  the  need  for  generating  tests  from  specifications  as  well  as 
programs.  On  the  other  hand,  a  careful  formal  analysis  of  the  usual  assumptions  surrounding  mutation  analysis 
shows  that  the  “competent  programmer  hypothesis”  does  not  suffice  to  ensure  the  claimed  high  reliability  of 
mutation testing. Hardware testing is shown to fit into the framework as well, and a brief consideration of it
shows how the practical differences between it and software testing arise.

[Grad87a]  Introduction:  Many  organizations  responsible  for  the  evolution  of  software  systems  seem  to  operate 
constantly  in  a  reactive  mode,  fighting  the  flames  of  the  most  recent  fire.  Behind  the  visible  sense  of  urgency, 
though,  three  primary  strategic  elements  appear  to  control  the  actions  of  managers: 

•  minimizing  defects, 

•  minimizing  engineering  effort  and  schedule,  and 

•  maximizing  customer  satisfaction. 

In  a  broad  sense,  the  ultimate  objective  of  all  three  approaches  is  customer  satisfaction.  This  article  specifically 
discusses  their  relationships  to  the  maintenance  of  delivered  software. 

[Gran84]  Abstract:  Considerable  resources  are  devoted  to  the  maintenance  of  programs  including  that  required 
to  correct  errors  not  discovered  until  after  the  programs  are  delivered  to  the  user.  A  number  of  factors  are 
believed  to  affect  the  occurrence  of  these  errors,  e.g.,  the  complexity  of  the  programs,  the  intensity  with  which 
programs are used, and the programming style. Several hundred programs making up a manufacturing support
system  are  analyzed  to  study  the  relationships  between  the  number  of  delivered  errors  and  measures  of  the  pro¬ 
grams’  size  and  complexity  (particularly  as  measured  by  software  science  metrics),  frequency  of  use,  and  age. 
Not  surprisingly,  program  size  is  found  to  be  the  best  predictor  of  repair  maintenance  requirements.  Repair 
maintenance  is  more  highly  correlated  with  the  number  of  lines  of  source  code  in  the  program  than  it  is  to 
software  science  metrics,  which  is  surprising  in  light  of  previously  reported  results.  Actual  error  rate  is  found  to 
be  much  higher  than  that  which  would  be  predicted  from  program  characteristics. 

[Grie76] Abstract: The ideas behind proofs for programs are outlined, and conventional definitions of assignment,
etc., are given. The main part of this paper is the idealized development of a nontrivial program in a disciplined
fashion. The use of Dijkstra’s “calculus” for formal development of programs as a guide to structured program
development is discussed in relation to the example presented.

[Grie77] Abstract: A parallel program, Dijkstra’s on-the-fly garbage collector, is proved correct using a proof
method  developed  by  Owicki.  The  fine  degree  of  interleaving  in  this  program  makes  it  especially  difficult  to 
understand,  and  complicates  the  proof  greatly.  Difficulties  with  proving  such  parallel  programs  correct  are  dis¬ 
cussed. 

[Grie79]  Abbreviated  Abstract:  The  most  prevalent  approach  to  proving  that  a  program  satisfies  a  given  pro¬ 
perty  has  been  the  invariant-assertion  method.  Invariant  assertions  are  supplied  to  express  relationships  between 
the  different  program  variables  and  are  attached  to  specific  program  points  with  the  understanding  that  the  asser¬ 
tion  is  to  hold  every  time  control  passes  through  the  points.  Assuming  that  the  assertion  attached  to  the  program 
entrance  (input  specification)  holds,  partial  correctness  is  established  if  we  can  prove  that  the  assertion  attached 
to  the  program  exit  (output  specification)  holds  whenever  control  reaches  the  exit.  A  completely  different 
method,  typically  the  well-founded  set  method,  is  applied  to  prove  program  termination,  i.e.,  to  prove  that  if  the 
input specification holds, then control will eventually reach the exit. The two proofs establish the total correctness
of the program. The intermittent-assertion method, originally introduced by R.M. Burstall, allows one to
establish  total  correctness  by  a  single  proof.  This  method  again  involves  affixing  assertions  to  points  in  the  pro¬ 
gram  with  the  intention  that,  at  least  once,  control  will  pass  through  the  point  with  the  assertion  satisfied  by  the 
current  variable  values. 

The authors first present and illustrate the intermittent-assertion method by a variety of examples selected
to illustrate different aspects of total correctness; some of the resulting proofs are markedly simpler than any
known conventional counterparts. Then the authors show how proofs by conventional methods may be translated into intermittent-assertion
proofs.  They  effectively  show  that  the  translation  process  is  purely  mathematical  and  does  not  increase  the  com¬ 
plexity  of  the  proof.  Finally,  they  present  two  applications  of  the  intermittent-assertion  method.  The  intermit¬ 
tent-assertion  method  is  employed  to  establish  the  validity  of  the  transformation  of  a  recursive  program  into  an 
equivalent  iterative  one.  The  second  application  is  concerned  with  the  correctness  proof  of  continuously  operat¬ 
ing  programs. 
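
The conventional two-part argument summarized in this abstract (invariant assertions for partial correctness, a well-founded quantity for termination) can be shown on a small example. The Python sketch below is an illustration chosen for this bibliography, not an example from [Grie79]; the `divide` function and its assertions are assumptions.

```python
def divide(dividend: int, divisor: int):
    """Integer division by repeated subtraction, annotated with assertions."""
    # Input specification, attached to the program entrance.
    assert dividend >= 0 and divisor > 0
    quotient, remainder = 0, dividend
    while remainder >= divisor:
        # Invariant assertion: holds every time control passes this point.
        assert dividend == quotient * divisor + remainder and remainder >= 0
        before = remainder
        quotient += 1
        remainder -= divisor
        # Well-founded set argument: the remainder is a non-negative
        # integer that strictly decreases, so the loop must terminate.
        assert 0 <= remainder < before
    # Output specification, attached to the program exit.
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder


result = divide(17, 5)
```

The invariant and the exit assertion establish partial correctness, while the decreasing remainder establishes termination; an intermittent-assertion proof would instead argue in one step that control sometime reaches the exit with the output specification satisfied.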

[Grna80a] Abstract: In the paper a comparison of processing time and reliability performance for the Recovery
Blocks scheme and N-Version Programming technique is presented. Derived queuing models can be useful in
deciding  which  of  the  strategies  should  be  used,  depending  on  system  parameters. 
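
For readers unfamiliar with the two strategies compared in this abstract, the following Python sketch contrasts their control structure. It says nothing about the queuing models of the paper; the function names, the integer square root task, and the deliberately faulty version are all hypothetical.

```python
import math
from collections import Counter


def recovery_block(alternates, acceptance_test, x):
    """Try alternates in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # treat a crash like a failed acceptance test
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def n_version(versions, x):
    """Run every version and return the strict-majority result."""
    results = [version(x) for version in versions]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value


def is_integer_sqrt(x, r):
    return r * r <= x < (r + 1) * (r + 1)


def faulty(x):
    return x // 2  # deliberately wrong for most inputs


rb_result = recovery_block([faulty, math.isqrt], is_integer_sqrt, 10)
nv_result = n_version([math.isqrt, faulty, lambda x: int(x ** 0.5)], 10)
```

The structural difference is visible in the sketch: a recovery block runs alternates sequentially and needs an acceptance test, whereas N-version programming always runs all versions and needs only a voter, which is why their processing-time behavior differs.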

[Grov80]  Abstract:  An  implementation  of  Ada  should  be  based  on  a  machine-independent  translator  generating 
code  for  a  Virtual  Machine,  which  can  be  realized  on  a  variety  of  machines.  This  approach,  which  leads  to  a 
high  degree  of  compiler  portability,  has  been  very  successful  in  a  number  of  recent  language  implementation  pro¬ 
jects  and  is  the  approach  which  has  been  specified  by  the  U.S.  Army  and  Air  Force  in  their  requirements  for 
Ada  implementations. 

This  paper  discusses  the  rationale,  requirements  and  design  of  such  a  Virtual  Machine  for  Ada.  The  dis¬ 
cussion  concentrates  on  a  number  of  fundamental  areas  in  which  problems  arise:  basic  Virtual  Machine  struc¬ 
ture,  including  storage  structure  and  addressing;  data  storage  and  manipulation;  flow  of  control;  subprograms, 
blocks and exceptions; and task handling.

[Guin87]  Abstract:  Program  Mutation  is  a  testing  methodology  that  provides  quantitative  information  on  the 
status  of  software  development.  Mothra  is  a  testing  environment  that  uses  Program  Mutation  as  its  underlying 
methodology.  It  consists  of  an  integrated  set  of  tools  and  interfaces  that  allow  the  user  to  interactively  test  a 
software  system  written  in  Fortran-77  throughout  the  software  development  cycle.  It  is  currently  being  run  under 
UNIX  4.X  BSD  and  Ultrix  V1.2. 

This document is primarily a user’s manual for first-time users of Mothra, although it also intends to
serve  as  a  reference  manual  for  the  more  experienced  user.  The  first  section  gives  introductory  background  and 
tries  to  explain  the  functionality  given  by  the  Mothra  testing  environment.  It  should  only  be  read  by  those  with  lit¬ 
tle  knowledge  of  mutation  testing,  and  those  wishing  more  detailed  information  should  consult  the  bibliography. 
In  the  second  section  the  different  user  interfaces  to  Mothra  are  explored  and  examples  of  software  testing  are 
developed.  A  user  wanting  questions  answered  about  the  specifics  of  an  interface  should  consult  the  section 
relating  to  that  specific  interface. 
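
The quantitative information that program mutation provides is usually a mutation score: the fraction of mutants that some test distinguishes from the original program. The Python sketch below illustrates the idea only; it is unrelated to Mothra's interfaces (Mothra targets Fortran-77), and the program, mutants, and test cases are hypothetical.

```python
def original(a, b):
    """Program under test: return the larger of two numbers."""
    return a if a > b else b


# Hand-written mutants, each a small syntactic change to `original`.
MUTANTS = {
    "> replaced by <": lambda a, b: a if a < b else b,
    "> replaced by >=": lambda a, b: a if a >= b else b,  # equivalent mutant
    "always return b": lambda a, b: b,
}


def mutation_score(tests):
    """Return (killed mutant names, fraction of mutants killed)."""
    killed = [
        name
        for name, mutant in MUTANTS.items()
        if any(mutant(*case) != original(*case) for case in tests)
    ]
    return killed, len(killed) / len(MUTANTS)


weak_killed, weak_score = mutation_score([(1, 2)])
strong_killed, strong_score = mutation_score([(1, 2), (2, 1)])
```

The second test set kills more mutants than the first. The `>=` mutant can never be killed because it behaves identically to the original on every input, which is why equivalent mutants have to be identified and excluded when interpreting a score.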

[Gutt77]  Abstract:  Abstract  data  types  can  play  a  significant  role  in  the  development  of  software  that  is  reliable, 
efficient,  and  flexible.  This  paper  presents  and  discusses  the  application  of  an  algebraic  technique  for  the  specifi¬ 
cation  of  abstract  data  types.  Among  the  examples  presented  is  a  top-down  development  of  a  symbol  table  for  a 
block  structured  language;  a  discussion  of  the  proof  of  its  correctness  is  given.  The  paper  also  contains  a  brief 
discussion  of  the  problems  involved  in  constructing  algebraic  specifications  that  are  both  consistent  and  com¬ 
plete. 

[Gutt78a]  Abstract:  A  data  abstraction  can  be  naturally  specified  using  algebraic  axioms.  The  virtue  of  these 
axioms  is  that  they  permit  a  representation-independent  formal  specification  of  a  data  type.  An  example  is  given 
which  shows  how  to  employ  algebraic  axioms  at  successive  levels  of  implementation.  The  major  thrust  of  the 
paper is twofold. First, it is shown how the use of algebraic axiomatizations can simplify the process of proving
the  correctness  of  an  implementation  of  an  abstract  data  type.  Second,  semi-automatic  tools  are  described 
which  can  be  used  both  to  automate  such  proofs  of  correctness  and  to  derive  an  immediate  implementation  from 
the  axioms.  This  implementation  allows  for  limited  testing  of  programs  at  design  time,  before  a  conventional 
implementation  is  accomplished. 
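
A small example may help show what a representation-independent algebraic specification looks like. The Python sketch below states familiar stack axioms as executable checks against one possible implementation; the axioms, the tuple representation, and the function names are assumptions for illustration and are not taken from [Gutt78a].

```python
# One possible representation: an immutable tuple, newest element last.
def new():
    return ()


def push(stack, item):
    return stack + (item,)


def pop(stack):
    if not stack:
        raise ValueError("pop of empty stack")
    return stack[:-1]


def top(stack):
    if not stack:
        raise ValueError("top of empty stack")
    return stack[-1]


def is_empty(stack):
    return len(stack) == 0


def satisfies_axioms(stack, item) -> bool:
    """Check the axioms for one stack value and one item.

    The axioms mention only the operations, never the representation:
        is_empty(new())           = true
        is_empty(push(s, x))      = false
        pop(push(s, x))           = s
        top(push(s, x))           = x
    """
    return (
        is_empty(new())
        and not is_empty(push(stack, item))
        and pop(push(stack, item)) == stack
        and top(push(stack, item)) == item
    )


samples = [new(), push(new(), 1), push(push(new(), 1), 2)]
all_hold = all(satisfies_axioms(s, x) for s in samples for x in (0, 7))
```

Checking the axioms on sample values is only testing; a correctness proof in the sense of the abstract would show that they hold for every stack value the implementation can produce.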

[Gutt78b]  Summary:  There  have  been  many  recent  proposals  for  embedding  abstract  data  types  in  programming 
languages.  In  order  to  reason  about  programs  using  abstract  data  types,  it  is  desirable  to  specify  their  properties 
at  an  abstract  level,  independent  of  any  particular  implementation.  This  paper  presents  an  algebraic  technique 
for such specifications, develops some of the formal properties of the technique, and shows that these provide
useful  guidelines  for  the  construction  of  adequate  specifications. 

[Gutt80]  Abstract:  The  formulation  and  analysis  of  a  design  specification  is  almost  always  of  more  utility  than 
the  verification  of  the  consistency  of  a  program  with  its  specification.  Good  specification  tools  can  assist  in  the 
process,  but  have  generally  not  been  proposed  and  evaluated  in  this  light.  In  this  paper  we  outline  a  specification 
language  combining  algebraic  axioms  and  predicate  transformers,  present  part  of  a  non-trivial  example  (the 
specification  of  a  high-level  interface  to  a  display),  and  finally  discuss  the  analysis  of  this  specification. 

[Hall80]  Abstract:  This  is  a  highly  non-technical  discussion  of  nine  concepts  basic  to  data  processing  security. 
The  concepts  are:  DP  RESOURCES,  THREAT,  VULNERABILITY,  EXPOSURE,  ADVERSE  EVENT, 
LIKELIHOOD, RISK, CONTROL, and RISK MANAGEMENT. A good grasp of these concepts and their
inter-relationships is key to understanding this relatively new and still decidedly undisciplined discipline.

[Hall86]  Abbreviated  Abstract:  Cloze  tests  (i.e.,  fill-in-missing-parts  tests)  have  been  a  long-standing  measure  of 
prose  comprehension.  They  seem  to  offer  software  engineers  several  theoretical  and  practical  advantages  over 
multiple-choice  comprehension  quizzes,  the  most  common  software  comprehension  measurement  tool.  Through 
human-subject experimentation, evidence was gathered to support the practical advantages of using the cloze
procedure  for  measuring  software  comprehension.  Cloze  tests  were  found  to  be  easy  to  construct,  administer, 
and  score  and  capable  of  discriminating  between  programs  of  varying  comprehensibility.  However,  discrepancies 
between multiple-choice comprehension quiz results and some cloze test results for the same software suggested
that  certain  forms  of  software  cloze  tests  may  not  be  valid.  A  model  of  software  cloze  tests  was  developed  to 
identify  a  software  cloze  test  characteristic  that  may  produce  invalid  results.  The  test  characteristic  was  concerned
with  the  relative  proportion  of  “program-dependent”  and  “program-independent”  cloze  items  within  a
test.  The  developed  model  was  shown  to  be  consistent  with  software  cloze  test  results  of  another  researcher  and 
led  to  suggestions  for  improving  software  cloze  testing. 

[Halp87]  Abstract:  Muse  is  a  verification  system  which  extends  the  collection  of  tools  developed  by  SRI  Inter¬ 
national  for  their  Hierarchical  Development  Methodology  (HDM).  It  enhances  the  SRI  system  by  providing  a 
capability  for  proving  invariants  and  constraints  for  the  state  machine  described  by  a  specification  written  in 
SPECIAL  (the  specification  language  of  HDM).  In  particular,  it  enables  one  to  use  the  HDM  system  to  meet  the
requirements  for  formal  verification  in  a  National  Computer  Security  Center  A1  evaluation  of  a  secure  operating 
system.  In  addition  to  the  tools  provided  by  SRI,  Muse  has  a  parser,  a  facility  to  handle  multiple  modules,  a  formula
generator,  and  a  theorem  prover.  The  theorem  prover  has  a  number  of  interesting  features  designed  to  facilitate
human  direction  of  the  proving  process.  In  concept,  it  is  open-ended.  We  introduce  the  notion  of  a  theorem
prover  kernel  as  a  device  for  ensuring  the  logical  soundness  of  the  prover  in  the  face  of  continual  improvements 
to  its  functionality. 

[Hals73b]  Abstract:  A  technique  for  measuring  simple  structural  properties  of  algorithms  is  described.  Using 
these  measures,  it  is  found  that  for  a  nontrivial  class  of  algorithms  there  is  a  quantitative  relationship  between 
operators  and  operands  and  their  usage.  Properties  of  “Full”  and  “Reduced”  algorithms  are  then  explored,  and
shown  to  predict  the  quantitative  relationship  observed. 

[Hals77a]  Abstract:  This  book  contains  the  first  systematic  summarization  of  a  branch  of  experimental  and 
theoretical  science  dealing  with  the  human  preparation  of  computer  programs  and  other  types  of  written 
material.  Application  of  the  classical  methods  of  the  natural  sciences  demonstrates  that  even  such  relatively 
intangible  objects  as  written  abstracts  and  computer  programs  are  governed  by  natural  laws,  both  in  their 
preparation  and  in  their  ultimate  form. 

The  work  underlying  each  chapter  of  this  monograph  is  firmly  based  on  the  methods  and  principles  of 
classical  experimental  science.  Even  so,  the  results  in  this  area,  or  more  specifically,  the  concept  that  significant 
quantitative  results  are  attainable  in  such  an  area,  are  sufficiently  counterintuitive  as  to  appear  almost  weird. 

Intuition,  however,  is  far  from  trustworthy,  as  demonstrated  when  that  ancient  scientist  dropped  the  wood 
and  lead  balls  from  the  tower  of  Pisa.  As  he  held  the  balls  over  the  edge  of  the  tower,  surely  the  much  greater  pull 
on  the  hand  holding  the  lead  ball  should  have  convinced  him  that  the  experiment  was  unnecessary;  that  no  two 
bodies  would  behave  the  same.  Even  today,  watching  a  feather  and  a  lead  shot  fall  through  a  vacuum  is  fascinat¬ 
ing,  because  it  is  still  “unexpected”  or  counterintuitive. 

Perhaps  it  is  this  same  sense  of  the  unexpected  that  has  fascinated  those  of  us  who  are  working  in  this
new  area  now  called  software  science.  The  first  experimental  results  were  obtained  nearly  five  years  ago;  since 
that  time  the  methods  have  been  refined  and  extended  in  many  unanticipated  directions,  but  in  each  case  further 
investigation  has  increased  rather  than  limited  confidence  in  the  results. 

[Hame82]  Abstract:  Karl  Popper  has  described  the  scientific  method  as  “the  method  of  bold  conjectures  and 
ingenious  and  severe  attempts  to  refute  them.”  Software  Science  has  made  bold  conjectures  in  postulating 
specific  relationships  between  various  ‘metrics’  of  software  code  and  in  ascribing  psychological  interpretations  to 
some  of  these  metrics. 

[Haml77a]  Abstract:  If  finite  input-output  specifications  are  added  to  the  syntax  of  programs,  these  specifications
can  be  verified  at  compile  time.  Programs  which  carry  adequate  tests  with  them  in  this  way  should  be
resistant  to  maintenance  errors.  If  the  specifications  are  independent  of  program  details  they  are  easy  to  give,
and  unlikely  to  contain  errors  in  common  with  the  program.  Furthermore,  certain  finite  specifications  are  maxi¬ 
mal  in  that  they  exercise  the  control  and  expression  structure  of  a  program  as  well  as  any  tests  can. 

A  testing  system  based  on  a  compiler  is  described,  in  which  compiled  code  is  utilized  under  interactive 
control,  but  “semantic”  errors  are  reported  in  the  style  of  conventional  syntax  errors.  The  implementation  is 
entirely  in  the  high-level  language  on  which  the  system  is  based,  using  some  novel  ideas  for  improving  documen¬ 
tation  without  sacrificing  efficiency. 

[Haml77b]  Abstract:  The  techniques  of  compiler  optimisation  can  be  applied  to  aid  a  programmer  in  writing  a 
program  which  cannot  be  improved  by  these  techniques.  A  finite,  representative  set  of  test  data  can  be  useful  in 
this  process.  This  paper  presents  the  theoretical  basis  for  the  (nonconstructive)  existence  of  test  sets  which  serve 
as  maximally  effective  stand-ins  for  an  unlimited  number  of  input  possibilities.  It  is  argued  that  although  the  time
required  by  a  compiler  to  fully  exercise  a  program  on  a  set  of  data  may  be  large,  the  corresponding  improvement
in  the  reliability  of  the  program  may  also  be  large  if  the  set  meets  the  given  theoretical  requirements. 

[Haml78a]  Abstract:  A  theory  of  program  testing  is  presented,  based  on  the  idea  of  “reliable  test  set.”  Intui¬ 
tively,  a  test  is  reliable  if  it  exposes  all  errors  that  any  test  could  find.  To  obtain  a  practical  theory,  two  alterations 
of  this  idea  are  suggested:  (1)  strengthen  the  form  of  specification  in  the  test,  (2)  restrict  the  kind  of  errors  that 
the  test  must  expose.  Both  of  these  changes  have  a  natural  application  to  program  maintenance. 

[Haml78c]  Abstract:  This  paper  investigates  the  application  of  the  execution  time  theory  of  software  reliability 
to  operational  computation  center  software.  A  brief  review  of  software  reliability  concepts  is  provided.  Studies 
of  individual  operating  system  components  are  discussed,  as  well  as  a  functional  subsystem.  This  work  is  based 
on  data  taken  at  a  large  operating  computation  center  over  a  period  of  15  months. 

[Haml86]  Abstract:  Program  testing  for  confidence  requires  a  probabilistic  method,  because  it  is  impossible  for 
finite  tests  to  guarantee  correctness  except  under  very  unrealistic  restrictions.  Existing  sampling  theory  has  not 
been  successfully  applied  to  software  because  of  two  peculiar  problems:  (1)  an  “operational  distribution”  of 
input  data  is  seldom  an  appropriate  description  of  program  use,  and  (2)  sample  independence  has  a  difficult 
meaning  for  programs.  Both  problems  arise  because  faults  reside  in  the  textual  space  of  a  program,  but  tests 
probe  this  space  only  through  the  input  domain. 

A  theory  is  presented  in  which  tests  establish  a  probability  of  correctness,  as  opposed  to  predicting  future 
behavior  from  past  samples.  The  success  of  the  theory  depends  on  dividing  the  textual  and  input  spaces  into 
units  for  which  uniform  sampling  is  appropriate.  Preliminary  work  shows  that  far  more  test  points  are  needed  to 
gain  confidence  in  a  program  than  predicted  by  the  usual  sampling  theory. 

[Haml87]  Abstract:  A  theory  of  ‘probable  correctness’  is  proposed  to  assess  the  reliability  of  software  through 
testing.  Current  research  in  testing  is  not  adequate  for  this  assessment.  Most  testing  methods  are  intended  for 
debugging,  to  find  failures  and  connect  them  to  program  faults  for  repair.  When  these  methods  no  longer  expose 
errors,  no  analysis  has  been  done  to  find  the  confidence  that  may  be  placed  in  the  software.  (Preliminary  results 
here  are  that  this  confidence  should  be  low.)  Other  work  applies  conventional  decision  theory  to  inputs  as  sam¬ 
ples  of  a  program’s  use.  The  application  is  suspect  because  the  necessary  independence  and  distribution  assump¬ 
tions  may  be  violated;  in  any  case,  the  results  are  intuitively  incorrect.  The  proposed  theory  relies  on  a  uniform 
distribution  of  test  samples,  but  relates  these  to  textually  occurring  faults.  Preliminary  results  include  an  analysis 
of  partition  testing,  and  suggestions  for  textual  sampling.  It  is  crucial  that  any  such  confidence  theory  be  plausi¬ 
ble,  so  the  foundations  of  program  sampling  are  examined  in  detail. 

[Haml88]  Abstract:  Partition  testing,  in  which  a  program’s  input  domain  is  divided  according  to  some  rule  and 
tests  conducted  within  the  subdomains,  enjoys  a  good  reputation.  However,  comparison  between  testing  that 
observes  partition  boundaries  and  random  sampling  that  ignores  the  partitions  gives  the  counterintuitive  result 
that  partitions  are  of  little  value.  In  this  paper  we  improve  the  negative  results  published  about  partition  testing, 
and  try  to  reconcile  them  with  its  intuitive  value.  Partition  testing  is  shown  to  be  more  valuable  than  random  testing
only  when  the  partitions  are  narrowly  based  on  expected  faults  and  there  is  a  good  chance  of  failure.  For  gaining
confidence  from  successful  tests,  partition  testing  as  usually  practiced  has  little  value.
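
The comparison summarized in [Haml88] can be illustrated with a small calculation. The Python sketch below is not taken from the paper; it assumes the simple model often used for such comparisons, in which every test fails independently with the failure rate of the subdomain it is drawn from. The function names and the failure rates are hypothetical.

```python
def p_random(theta, n):
    """Probability that at least one of n random tests fails,
    given an overall failure rate theta."""
    return 1.0 - (1.0 - theta) ** n


def p_partition(thetas, tests_per_subdomain):
    """Probability that at least one test fails when subdomain i
    (failure rate thetas[i]) receives tests_per_subdomain[i] tests."""
    miss = 1.0
    for theta_i, n_i in zip(thetas, tests_per_subdomain):
        miss *= (1.0 - theta_i) ** n_i
    return 1.0 - miss


# Four equally likely subdomains, one test each (assumed example values).
cases = {
    "failures spread evenly": [0.01, 0.01, 0.01, 0.01],
    "one bad subdomain, low rate": [0.04, 0.0, 0.0, 0.0],
    "one bad subdomain, high rate": [0.4, 0.0, 0.0, 0.0],
}

for label, thetas in cases.items():
    overall = sum(thetas) / len(thetas)  # valid because subdomains are equally likely
    print(label, p_partition(thetas, [1, 1, 1, 1]), p_random(overall, 4))
```

Under these assumptions the two strategies are indistinguishable when failures are spread evenly, and partitioning gains a visible advantage only when a subdomain isolates the failures and the chance of failure there is high.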

[Hane72]  Abbreviated  Introduction:  The  largest  challenge  facing  software  engineers  today  is  to  find  ways  to 
deliver  large  systems  on  schedule.  How  can  we  peer  into  the  hazy  contingency  portion  of  a  schedule  and  predict 
in  greater  detail  where  bugs  will  occur,  who  will  be  needed  to  fix  them,  elapsed  time  between  internal  releases, 
etc.?  Belady  and  Lehman  suggest  the  need  for  a  “micro-model”  for  system  activities,  i.e.,  a  model  based  on 
internal,  structural  aspects  of  a  system.  This  is  essentially  the  objective  of  this  paper.  In  the  following  sections, 
we  will  develop  a  very  simple,  but  useful,  technique  for  modeling  the  “stabilization”  of  a  large  system  as  a  func¬ 
tion  of  its  internal  structure. 

The  concrete  result  described  in  this  paper  is  a  simple  matrix  formula  which  serves  as  a  useful  model  for 
the  “rippling”  effect  of  changes  in  a  system.  The  real  emphasis  is  on  the  use  of  the  formula  as  a  model;  i.e.,  as  an 
aid  to  understanding.  The  formula  can  certainly  be  used  to  obtain  numeric  estimates  for  specific  systems,  but  its 
greater  value  is  that  it  helps  to  explain,  in  terms  of  system  structure  and  complexity,  why  the  process  of  changing 
a  system  is  generally  more  involved  than  our  intuition  leads  us  to  believe.

[Hanf70]  Introduction:  The  “syntax  machine”  discussed  here  automatically  generates  random  test  cases  for  any 
suitably  defined  programming  language.  The  test  cases  it  produces  are  syntactically  valid  programs.  But  they  are 
not  “meaningful,”  and  if  an  attempt  is  made  to  execute  them,  the  results  are  unpredictable  and  uncheckable.  For 
this  reason,  they  are  less  valuable  than  handwritten  test  cases.  However,  as  an  inexhaustible  source  of  new  test 
material,  the  syntax  machine  has  shown  itself  to  be  a  valuable  tool. 

In  the  following  sections,  we  characterize  the  use  of  this  tool  in  testing  different  types  of  language  proces¬ 
sors,  introduce  the  concept  of  “dynamic  grammar”  of  a  programming  language,  outline  the  structure  of  the  sys¬ 
tem,  and  show  what  the  syntax  machine  does  by  means  of  some  examples. 
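
The idea of generating syntactically valid but meaningless test programs from a language definition can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a plain context-free grammar for arithmetic expressions (a hypothetical `GRAMMAR` table), not the "dynamic grammar" concept of [Hanf70], and a depth limit that is an addition of this sketch to guarantee termination.

```python
import random

# Hypothetical grammar: each nonterminal maps to a list of productions.
# The first production of every nonterminal is the shortest one.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["factor"], ["factor", "*", "term"]],
    "factor": [["digit"], ["(", "expr", ")"]],
    "digit": [[d] for d in "0123456789"],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand symbol into a random sentence of the grammar."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, always take the shortest production
        # so that the expansion is guaranteed to terminate.
        productions = productions[:1]
    production = rng.choice(productions)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)


rng = random.Random(1970)  # fixed seed so that a run can be repeated
for _ in range(3):
    print(generate("expr", rng))
```

Each generated string could then be fed to the language processor under test; as the introduction notes, only its syntactic validity is known in advance.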

[Hans73]  Summary:  A  central  problem  in  program  design  is  to  structure  a  large  program  such  that  it  can  be 
tested  systematically  by  the  simplest  possible  techniques.  This  paper  describes  the  method  used  to  test  the  RC 
4000  multiprogramming  system.  During  testing,  the  system  records  all  transitions  of  processes  and  messages 
between  various  queues.  The  test  mechanism  consists  of  fifty  machine  instructions  centralized  in  two  pro¬ 
cedures.  By  using  this  mechanism  in  a  series  of  carefully  selected  test  cases,  the  system  was  made  virtually  error 
free  within  a  few  weeks.  The  test  procedure  is  illustrated  by  examples. 

[Hans78]  Abbreviated  Introduction:  In  a  recent  paper,  McCabe  introduced  the  cyclomatic  number  of  a  pro¬ 
gram’s  flow  graph  as  a  measure  of  its  complexity.  Myers  proposed  an  improved  measure  consisting  of  an  interval 
with  the  original  measure  as  its  upper  bound.  [The  author]  will  argue  that  if  two  values  are  to  be  presented  as  a
measure  it  is  preferable  to  couple  a  variation  of  the  cyclomatic  number  with  a  measure  of  the  program’s  expres¬ 
sion  complexity. 
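
For reference, McCabe's cyclomatic number of a flow graph with E edges, N nodes, and P connected components is V(G) = E - N + 2P. The following Python sketch computes it for a hypothetical four-block flow graph containing one loop; the graph and the function name are assumptions made for illustration, and the sketch covers neither Myers's interval nor the expression-complexity measure discussed in [Hans78].

```python
def cyclomatic_number(successors, components=1):
    """Return E - N + 2P for a flow graph given as a successor map."""
    nodes = len(successors)
    edges = sum(len(targets) for targets in successors.values())
    return edges - nodes + 2 * components


# Hypothetical flow graph of a single while loop.
loop = {
    "entry": ["test"],
    "test": ["body", "exit"],
    "body": ["test"],
    "exit": [],
}

print(cyclomatic_number(loop))  # 4 edges - 4 nodes + 2 = 2
```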

[Hans84]  Abbreviated  Preface:  Software  designers  are  dissatisfied  with  the  present  status  of  quality  assurance 
and  control.  Methods  and  tools  are  being  developed  that  attempt  to  locate  errors  in  systems  or  to  demonstrate 
the  absence  of  such  errors.  Although  many  of  these  tools  are  still  experimental  and  difficult  to  use,  they  have 
been  used  successfully  in  a  number  of  applications.  Although  additional  research  on  validation  is  necessary,  it  is 
likely  that  developers  of  systems  could  make  good  use  of  some  of  these  tools  provided  they  are  robust  and  stable. 

From  the  point  of  assuring  quality  throughout  the  entire  life  cycle,  most  of  the  existing  methods  and  tools 
are  only  suitable  for  specific  phases  and  error  classes.  By  suitable  combinations  of  methods  and  tools,  quality 
assurance  and  control  should  become  more  effective.  The  required  combination  depends  on  the  specific  quality 
requirements,  on  the  current  situation  of  a  project,  and  last  but  not  least,  on  the  available  resources. 

The  goals  of  this  [book  are]  to  review  the  current  status  of  software  validation  technology,  to  provide  an 
in-depth  look  at  the  issues,  and  to  project  future  developments,  all  in  the  light  of  the  overall  aim  of  achieving  an 
integrated  framework  for  software  validation. 

[Hant76]  Abstract:  This  paper  explains,  in  an  introductory  fashion,  the  method  of  specifying  the  correct
behavior  of  a  program  by  use  of  input/output  assertions  and  describes  one  method  of  showing  that  the  program  is 
correct  with  respect  to  those  assertions.  An  initial  assertion  characterizes  conditions  expected  to  be  true  upon 
entry  to  the  program  and  a  final  assertion  characterizes  conditions  expected  to  be  true  upon  exit  from  the  program.
When  a  program  contains  no  branches,  a  technique  known  as  symbolic  execution  can  be  used  to  show
that  the  truth  of  the  initial  assertion  upon  entry  guarantees  the  truth  of  the  final  assertion  upon  exit.  More  gen¬ 
erally,  for  a  program  with  branches  one  can  define  a  symbolic  execution  tree.  If  there  is  an  upper  bound  on  the 
number  of  times  each  loop  in  such  programs  can  be  executed,  a  proof  of  correctness  can  be  given  by  a  simple 
traversal  of  the  (finite)  symbolic  execution  tree.

However,  for  most  programs,  no  fixed  bound  of  the  number  of  times  each  loop  is  executed  exists  and  the 
corresponding  symbolic  execution  trees  are  infinite.  In  order  to  prove  the  correctness  of  such  programs,  a  more 
general  assertion  structure  must  be  provided.  The  symbolic  execution  tree  of  such  programs  must  be  traversed 
inductively  rather  than  explicitly.  This  leads  naturally  to  the  use  of  additional  assertions  which  are  called  “induc¬ 
tive  assertions.” 

[Harr81a]  Abbreviated  Introduction:  The  calculation  of  the  cyclomatic  number  proves  to  be  an  effective  com¬ 
plexity  measure.  However,  because  the  cyclomatic  measure  only  counts  the  number  of  basic  paths,  it  is  incapable 
of  recognizing  the  effects  of  two  major  complexity  factors  which  can  be  intuitively  seen  to  increase  program  com¬ 
plexity.  These  two  items  are  the  complexity  of  the  individual  blocks  within  the  program  -  which  we  shall  refer  to 
as  “program  magnitude,”  and  the  program. 

[Harr81b]  Conclusion:  Two  programs  may  be  of  the  same  length  and  possess  equivalent  properties  in  all 
respects  except  for  the  control  structure  configuration.  We  have  illustrated  the  variations  in  complexity  which 
may  arise  from  such  situations  by  using  two  topological  measures,  viz.,  McCabe’s  Cyclomatic  Complexity 
Number  and  the  Scope  Complexity  Ratio.  The  Scope  Measure  is  able  to  distinguish  among  programs  which  the 
cyclomatic  number  measure  considers  to  be  equally  complex. 

[Harr82]  Abbreviated  Introduction:  Over  the  past  several  years,  computer  scientists  have  devoted  a  great  deal 
of  effort  to  measuring  computer  program  “complexity,”  since  many  large  software  systems  can  be  used  for  10, 
15,  or  even  20  years.  A  large  part  of  that  time  involves  maintenance  activities,  which  include  all  changes  made  to 
a  piece  of  software  after  it  has  been  delivered  to  and  accepted  by  the  final  user.  Consequently,  maintenance  is 
most  affected  by  program  complexity. 

Recent  estimates  suggest  that  about  40  to  70  percent  of  annual  software  expenditures  involve  maintenance 
of  existing  systems.  Clearly,  if  complexities  could  somehow  be  identified,  then  programmers  could  adjust 
maintenance  procedures  accordingly.  What  is  needed  is  some  method  of  pinpointing  the  characteristics  of  a 
computer  program  that  are  difficult  to  maintain  and  measuring  the  degree  of  their  presence  (or  lack  of  it).  Vari¬ 
ous  approaches  may  be  taken  in  measuring  complexity  characteristics,  such  as  Baird  and  Noma’s  approach,  in 
which  scales  of  measurement  are  divided  into  the  following  four  types: 

1.  Nominal  scales. 

2.  Ordinal  scales. 

3.  Interval  scales. 

4.  Ratio  scales. 

We  believe  the  ordinal  scale  to  be  the  best  choice  for  examining  complexity  metrics,  and  all  measures  dis¬ 
cussed  in  this  article  are  in  that  framework. 

[Harr85]  Summary:  We  have  developed  a  Reduced  Form  which  allows  software  complexity  data  to  be  shared 
among  researchers,  and  at  the  same  time  prevents  the  reconstruction  of  the  actual  source  code,  thus  preserving 
the  confidentiality  of  the  software.  This  is  a  major  concern  of  many  organizations  considering  participating  in 
metric  research. 

Some  current  metrics  can  be  obtained  from  the  Reduced  Form.  Additional  work  must  be  done  to  include 
information  needed  for  other  metrics  before  this  tool  can  be  finalized.  It  is  hoped  that  this  paper  will  spark  an
interest  in  other  complexity  metric  researchers  who  can  contribute  to  the  development  of  the  Reduced  Form.
Those  interested  should  see  the  companion  paper  in  this  issue. 

[HarrMb]  Abbreviated  Introduction:  Over  the  past  decade,  numerous  attempts  have  been  made  to  develop 
software  complexity  measurements.  Formulating  such  measures  would  allow  us  to  compare  two  programs  and 
see  which  was  more  complex.  This  article  describes  one  very  popular  approach  to  measuring  software  complex¬ 
ity:  software  science. 
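
As a brief illustration of what software science measures, the Python sketch below computes the basic quantities as they are usually defined: the vocabulary n = n1 + n2 (distinct operators plus distinct operands), the length N = N1 + N2 (total occurrences), the estimated length n1 log2 n1 + n2 log2 n2, and the volume V = N log2 n. The token lists are hypothetical, and the classification of tokens into operators and operands is left to the caller; none of this is taken from the article itself.

```python
import math


def software_science(operators, operands):
    """Compute basic software science measures from token lists."""
    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total occurrences
    vocabulary = n1 + n2
    length = N1 + N2
    estimated_length = n1 * math.log2(n1) + n2 * math.log2(n2)
    volume = length * math.log2(vocabulary)
    return {
        "vocabulary": vocabulary,
        "length": length,
        "estimated_length": estimated_length,
        "volume": volume,
    }


# Hypothetical tokens for:  a = b + c ; d = a * b
measures = software_science(
    operators=["=", "+", "=", "*"],
    operands=["a", "b", "c", "d", "a", "b"],
)
print(measures)
```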

[Harr88c]  Abstract:  There  have  been  several  efforts  to  use  symbolic  execution  to  test  and  analyze  concurrent 
programs.  Recently  proof  systems  have  also  emerged  for  concurrent  programs  and  for  the  Ada  language  in  par¬ 
ticular.  This  paper  focuses  on  using  symbolic  properties  of  Ada  programs.  It  expands  upon  past  efforts  by  incor¬ 
porating  tasking  proof  rules  into  the  symbolic  executor  allowing  Ada  programs  with  tasking  to  be  formally  veri¬ 
fied. 

[Hart71]  Abstract:  The  purpose  of  this  paper  is  to  outline  the  theory  of  computational  complexity  which  has 
emerged  as  a  comprehensive  theory  during  the  last  decade.  This  theory  is  concerned  with  the  quantitative 
aspects  of  computations  and  its  central  theme  is  the  measuring  of  the  difficulty  of  computing  functions.  The 
paper  concentrates  on  the  study  of  computational  complexity  measures  defined  for  all  computable  functions  and 
makes  no  attempt  to  survey  the  whole  field  exhaustively  nor  to  present  the  material  in  historical  order.  Rather  it 
presents  the  basic  concepts,  results,  and  techniques  of  computational  complexity  from  a  new  point  of  view  from 
which  the  ideas  are  more  easily  understood  and  fit  together  as  a  coherent  whole. 

[Hart79]  Abstract:  The  Advanced  Interactive  Debugging  System  (AIDS)  is  described.  It  is  a  powerful  high- 
level  symbolic  interactive  debugging  aid.  AIDS  is  intended  to  be  available  in  a  program’s  environment  without 
requiring  debugging  statements  in  the  program’s  source  code  or  inclusion  of  AIDS  in  the  program’s  executable 
module. 

[Hass80]  Abstract:  White  and  Cohen  have  proposed  the  domain  testing  method,  which  attempts  to  uncover 
errors  in  a  path  domain  by  selecting  test  data  on  and  near  the  boundary  of  the  path  domain.  The  goal  of  domain 
testing  is  to  demonstrate  that  the  boundary  is  correct  within  an  acceptable  error  bound.  Domain  testing  is  intui¬ 
tively  appealing  in  that  it  provides  a  method  for  satisfying  the  often  suggested  guideline  that  boundary  conditions 
should  be  tested. 

In  addition  to  proposing  the  domain  testing  method,  White  and  Cohen  have  developed  a  test  data  selec¬ 
tion  strategy,  which  attempts  to  satisfy  this  method.  Further,  they  have  described  two  error  measures  for  evaluat¬ 
ing  domain  testing  strategies.  This  paper  takes  a  close  look  at  their  strategy  and  their  proposed  error  measures.  It 
is  shown  that  inordinately  large  domain  errors  may  remain  undetected  by  the  White  and  Cohen  strategy.  Two 
alternative  domain  testing  strategies,  which  improve  on  the  error  bound,  are  then  proposed  and  the  complexity 
of  each  of  the  three  strategies  is  analyzed.  Finally,  several  other  issues  that  must  be  addressed  by  domain  testing 
are  presented  and  the  general  applicability  of  this  method  is  discussed. 

[Hech75]  Abstract:  A  simple,  iterative  bit  propagation  algorithm  for  solving  global  data  flow  analysis  problems 
such  as  “available  expressions”  and  “live  variables”  is  presented  and  shown  to  be  quite  comparable  in  speed  to 
the  corresponding  interval  analysis  algorithm.  This  comparison  is  facilitated  by  a  result  relating  two  parameters 
of  a  reducible  flow  graph  (rfg).  Namely,  if  G  is  an  rfg,  d  is  the  largest  number  of  back  edges  found  in  any  cycle- 
free  path  in  G,  and  k  is  the  length  of  the  interval  derived  sequence  of  G,  then  k>=d.  (Intuitively,  k  is  the  max¬ 
imum  nesting  depth  of  loops  in  a  computer  program,  while  d  is  a  measure  of  the  maximum  loop-interconnected¬ 
ness.)  The  node  ordering  employed  by  the  simple  algorithm  is  the  reverse  of  the  order  in  which  a  node  is  last 
visited  while  growing  any  depth-first  spanning  tree  of  the  flow  graph.  In  addition,  a  dominator  algorithm  for  an 
rfg  is  presented  which  takes  O(edges)  bit  vector  steps. 
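
The flavor of an iterative bit-vector algorithm can be conveyed with a short Python sketch for the "live variables" problem. It is a simplified illustration rather than the algorithm of [Hech75]: Python integers stand in for bit vectors, the visiting order is supplied by the caller instead of being derived from a depth-first spanning tree, and the flow graph, bit assignments, and names are hypothetical.

```python
def live_variables(succ, use, defs, order):
    """Iterate the live-variable equations to a fixed point.

    live_out[n] = union of live_in[s] over successors s of n
    live_in[n]  = use[n] | (live_out[n] & ~defs[n])
    """
    live_in = {n: 0 for n in succ}
    live_out = {n: 0 for n in succ}
    changed = True
    while changed:
        changed = False
        for n in order:
            out = 0
            for s in succ[n]:
                out |= live_in[s]
            new_in = use[n] | (out & ~defs[n])
            if out != live_out[n] or new_in != live_in[n]:
                live_out[n], live_in[n] = out, new_in
                changed = True
    return live_in, live_out


# One bit per variable (assumed encoding).
I, N, S = 0b001, 0b010, 0b100

# Hypothetical flow graph for:
#   B1: i = 0          B2: if i < n goto B3 else B4
#   B3: s = s + i; i = i + 1; goto B2       B4: return s
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
use = {"B1": 0, "B2": I | N, "B3": S | I, "B4": S}
defs = {"B1": I, "B2": 0, "B3": S | I, "B4": 0}

live_in, live_out = live_variables(succ, use, defs, ["B4", "B3", "B2", "B1"])
print({n: bin(bits) for n, bits in live_in.items()})
```

Because liveness propagates backward, visiting blocks from the exit toward the entry lets most information settle in the first pass, which is why the choice of node ordering matters to the speed of such algorithms.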

[Hech77a]  Abbreviated  Preface:  This  book  presents  a  theoretical  foundation  for  the  pre-execution  analysis  of
computer  programs  that  is  usually  referred  to  as  control  flow  analysis  and  data  flow  analysis.  Flow  analysis  is  a
fundamental  prerequisite  for  many  important  types  of  code  improvement.  In  general,  control  flow  analysis  pre¬ 
cedes  data  flow  analysis.  Control  flow  analysis  is  the  encoding  of  pertinent,  possible  program  control  flow  struc¬ 
ture  or  flow  of  control,  usually  in  the  form  of  one  or  more  graphs.  Data  flow  analysis  is  the  process  of  ascertain¬ 
ing  and  collecting  information  prior  to  program  execution  about  the  possible  modification,  preservation,  and  use 
of  certain  entities  (such  as  values  or  various  attributes  of  variables)  in  a  computer  program. 

The  primary  goal  of  this  book  is  to  teach  people  algorithms  to  incorporate  in  code  improvers.  However, 
these  algorithms  do  not  perform  various  code  improvements  per  se,  but  instead  gather  information  prerequisite 
to  many  code  improvements.  Thus,  the  subject  of  this  book  is  not  code  improvement,  but  only  one  constituent 
process  used  in  many  code  improvers.  The  reader  will  be  introduced  to  typical  problems  requiring  flow  analysis 
algorithms  and  the  theoretical  foundation  for  these  algorithms. 

[Hech79]  Abstract:  Limitations  in  the  current  capabilities  for  verifying  programs  by  formal  proof  or  by  exhaus¬ 
tive  testing  have  led  to  the  investigation  of  fault-tolerance  techniques  for  applications  where  the  consequence  of 
failure  is  particularly  severe.  Two  current  approaches,  N-version  programming  and  the  recovery  block,  are 
described.  A  critical  feature  in  the  latter  is  the  acceptance  test,  and  a  number  of  useful  techniques  for  construct¬ 
ing  these  are  presented.  A  system  reliability  model  for  the  recovery  block  is  introduced,  and  conclusions  derived 
from  this  model  that  affect  the  design  of  fault-tolerant  software  are  discussed. 
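
The recovery block structure mentioned in [Hech79] can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the scheme analyzed in the paper: the function `recovery_block`, the two sorting alternates, and the acceptance test are all hypothetical, and state restoration is approximated by giving each alternate its own deep copy of the input.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Try each alternate on a fresh copy of state; return the first
    result that passes the acceptance test."""
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)  # failed attempts cannot damage state
        try:
            result = alternate(checkpoint)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(state, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def buggy_sort(items):
    items.reverse()  # deliberately wrong primary: reverses instead of sorting
    return items


def fallback_sort(items):
    return sorted(items)


def ordered_and_complete(original, result):
    """Cheap acceptance test: same length and non-decreasing order."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return in_order and len(result) == len(original)


data = [3, 1, 2]
print(recovery_block(data, [buggy_sort, fallback_sort], ordered_and_complete))
```

The acceptance test here is intentionally weaker than a full correctness check, which reflects the trade-off the abstract points to: the test must be cheap enough to run on every call yet strong enough to reject faulty results.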

[Hech80]  Abstract:  Many  new  uses  of  computers  require  extremely  high  reliability  of  the  computing  function  as 
a  whole,  and  the  software  involved  must  conform  to  these  requirements.  In  more  conventional  applications,  the 
attainment  of  higher  software  reliability  will  be  of  great  economic  benefit.  A  study  of  one  scientific  computing 
center,  in  which  a  good  deal  of  effort  had  been  devoted  to  providing  a  highly  dependable  facility,  showed  386  ser¬ 
vice  interruptions  in  one  year.  Of  these,  227  (almost  60%)  were  due  to  software  problems.  In  software  generated 
for  the  military  services,  quality  assurance  provisions  are  now  being  invoked  which  are  motivated  by  the  need  for 
higher  reliability  as  well  as  for  greater  ease  of  maintenance. 

[Hell72]  Abstract:  The  computational  work  of  a  process  is  measured  in  terms  of  the  information  in  a  memory 
for  its  table-lookup  implementation.  This  measure  is  applied  first  to  simple  logical  and  arithmetic  processes,  and 
then  more  complicated  processes  comprising  organizations  (called  synergisms)  of  several  subprocesses.  The 
computational  advantages  of  Cartesian,  compositional,  and  sequential  synergisms  are  investigated  and  illus¬ 
trated  by  means  of  the  work  measure.  The  relation  between  the  work  of  a  process  and  the  work  capacity  of  a 
facility  on  which  it  is  implemented  is  examined,  and  a  concept  of  efficiency  of  implementations  is  formulated.  A 
few  areas  for  further  investigation  are  outlined. 

[Heil87]  Abstract:  This  document  presents  procedures  to  be  followed  by  flight  dynamics  software  development 
projects  that  are  monitored  by  the  Software  Engineering  Laboratory  (SEL)  for  collecting  data  in  support  of 
software  engineering  research  activities.  An  overview  of  data  collection  during  the  life  cycle  of  a  development 
project  is  presented.  This  overview  is  followed  by  a  discussion  of  the  manner  in  which  the  SEL  measures  the 
structure  and  growth  of  the  software  product.  Finally,  detailed  instructions  for  the  completion  and  submission  of
SEL  data  collection  forms  are  presented. 

[Helm83]  Abstract:  A  runtime  monitoring  system  for  detecting  and  describing  tasking  errors  in  Ada  programs  is 
presented. 

Basic  concepts  for  classifying  tasking  errors,  called  deadness  errors,  are  defined.  These  concepts  indicate 
which  aspects  of  an  Ada  computation  must  be  monitored  in  order  to  detect  deadness  errors  resulting  from 
attempts  to  rendezvous  or  terminate.  They  also  provide  a  basis  for  the  definition  and  proof  of  correct  detection. 
Descriptions  of  deadness  errors  are  given  in  terms  of  the  basic  concepts. 

The  monitoring  system  has  two  parts:  (1)  a  separately  compiled  runtime  monitor  that  is  added  to  any  Ada
source  text  to  be  monitored,  and  (2)  a  pre-processor  that  transforms  the  Ada  source  text  so  that  necessary 
descriptive  data  is  communicated  to  the  monitor  at  runtime.  Some  basic  preprocessing  transformations  and  an 
abstract  monitoring  for  a  limited  class  of  errors  were  previously  presented.  Here  an  Ada  implementation  of  a
monitor  and  a  more  extensive  set  of  pre-processing  transformations  are  described.  This  system  provides  an 
experimental  automated  tool  for  detecting  deadness  errors  in  Ada83  tasking  and  supplies  useful  diagnostics.  The 
use  of  the  runtime  monitor  for  debugging  and  programming  evasive  actions  to  avoid  imminent  errors  is  described 
and  examples  of  experiments  are  given. 

[Helm84a] Abstract: A new class of errors, not found in sequential languages, can result when the tasking constructs
of Ada are used. These errors are called deadness errors and arise when task communication fails. Since
deadness  errors  often  occur  intermittently,  they  are  particularly  hard  to  detect  and  diagnose.  Previous  papers 
describe  the  theory  and  implementation  of  runtime  monitors  to  detect  deadness  errors  in  tasking  programs.  The 
problems  of  detection  and  description  of  errors  are  different.  Even  when  a  dead  state  is  detected,  giving  ade¬ 
quate  diagnostics  that  enable  the  programmer  to  locate  its  cause  in  the  Ada  text  is  difficult.  This  paper  discusses 
the  use  of  simple  diagnostic  descriptions  based  on  Ada  tasking  concepts.  These  diagnostics  are  implemented  in 
an  experimental  runtime  monitor.  Similar  facilities  could  be  implemented  in  task  debuggers  in  forthcoming  Ada 
support  environments.  Their  usefulness  and  shortcomings  are  illustrated  in  an  example  experiment  with  the  run¬ 
time  monitor.  Possible  future  directions  in  task  error  monitoring  and  diagnosis  based  on  formal  specifications  are 
discussed. 

[Helm85]  Abstract:  TSL  is  a  language  for  specifying  sequences  of  tasking  events  in  Ada  programs.  TSL  specifi¬ 
cations  are  submitted  with  an  Ada  program  and  are  monitored  at  runtime  for  consistency  with  the  actual  tasking 
events  as  they  occur.  This  paper  presents  a  preliminary  design  for  TSL,  an  informal  overview  of  its  capabilities, 
and  an  operational  semantics. 

[Hend75]  Abstract:  A  technique  is  presented  whereby  a  significant  amount  of  program  validation  can  be  done 
simply  by  exercising  the  program  components  in  a  model  environment  provided  by  a  finite  state  machine,  spe¬ 
cially  built  to  characterise  the  real  environment  of  that  component.  The  tools  necessary  to  support  such  a  tech¬ 
nique  are  characterised  and  the  merits  and  demerits  of  the  technique  are  discussed. 

[Henn76b] Abbreviated Introduction: In this note we introduce the concept of a Linear Code Sequence and
Jump (LCSAJ) triple. With the aid of these LCSAJs it is possible to analyze both the static and dynamic characteristics
of computer programs. In particular they have been extensively used in the analysis of numerical algorithms
in both FORTRAN IV and ALGOL 68.

[Henn78]  Abstract:  This  paper  describes  an  experimental  testbed  facility  designed  to  examine  some  of  the  prob¬ 
lems  which  arise  in  the  implementation  of  high  quality  numerical  software  libraries. 

The  testbed  is  used  to  measure  the  effectiveness  of  test  programs.  Effectiveness  here  is  used  in  the  sense 
that  these  test  programs  should  ensure  that  the  routine  implementation  is  error  free  rather  than  to  examine  the 
numerical properties of the algorithm.

The  testbed  has  been  used  in  extensive  investigations  of  the  stringent  test  programs  of  the  NAG  numerical 
algorithms library and continuation of this work is seen as a major application for the testbed.

[Henn84] Abstract: The roles and capabilities of LDRA software Testbeds and their appropriate environments
have been described in a number of papers. The way in which management uses the tools as elements of a controlled
software development environment is described. The principal benefits of such use are that management
has  the  assurance  that  software  development  standards  are  enforced,  and  has  reliable  information  concerning 
project  status.  The  explicit  standards  enforced  by  use  of  these  tools  are  described  in  detail  [elsewhere].  One  class 
of  these  standards  is  that  of  test  effectiveness,  which  is  measured  primarily  through  three  test  effectiveness 
metrics  reinforced  by  a  code  auditing  capability. 

This  paper  attempts  to  quantify  the  benefits  of  using  such  a  software  Testbed  in  providing  assurance  of  the 
absence  of  program  errors.  The  attempt  is  made  from  two  viewpoints,  the  theoretical  and  the  experimental. 

The  theoretical  aspect  is  important  because  the  practical  use  of  a  tool  may  fail  to  demonstrate  that  the  tool 
can be a powerful detector of a class of errors simply because no errors of that type were present in the software
sample  validated. 

Finally  the  paper  attempts  to  summarize  some  of  the  experiences  gained  through  the  use  of  the  tools  over 
a  twelve  year  period. 

[HennXX]  Abbreviated  Introduction:  The  principal  objective  in  functional  testing  is  to  verify  that  a  software  sys¬ 
tem  satisfies  the  requirements.  This  is  achieved  by  constructing  test  data  which  in  some  way  explores  each  of  the 
possibly  many  functions  which  the  system  is  required  to  perform. 

When  the  requirements  are  expressed  in  terms  of  functions,  then  it  is  possible  to  expand  the  detail  until 
the  requirements  are  expressed  in  terms  of  Basic  User  Perceived  functions.  Termination  criteria  for  this  process 
are difficult to specify; the point is that there should be no further fine structures as perceived by the users. One
task  of  functional  testing  is  to  take  each  of  these  Basic  User  Perceived  functions  and  supply  test  data  to  exercise 
them.  In  general,  it  is  not  sensible  or  even  possible  to  test  them  individually,  so  they  must  be  tested  in  various 
combinations. 

There  is  now  a  considerable  body  of  functional  tests  which  have  been  investigated  by  using  structural  test¬ 
ing  metrics.  That  is,  after  the  functional  tests  have  been  completed,  the  software  and  test  data  are  examined  by 
structural testing techniques to obtain the coverage metrics. The overall results of traditional functional tests
are  not  impressive  in  terms  of  exercising  the  software.  However,  it  has  not  been  clear  what  further  improvements 
should  be  made.  It  is  the  objective  of  this  chapter  to  improve  this  position. 

[Henr81a]  Abstract:  Automatable  metrics  of  software  quality  appear  to  have  numerous  advantages  in  the 
design,  construction  and  maintenance  of  software  systems.  While  numerous  such  metrics  have  been  defined,  and 
several  of  them  have  been  validated  on  actual  systems,  significant  work  remains  to  be  done  to  establish  the  rela¬ 
tionships  among  these  metrics.  This  paper  reports  the  results  of  correlation  studies  made  among  three  complex¬ 
ity  metrics  which  were  applied  to  the  same  software  system.  The  three  complexity  metrics  used  were  Halstead’s 
effort,  McCabe’s  cyclomatic  complexity  and  Henry  and  Kafura’s  information  flow  complexity.  The  common 
software  system  was  the  UNIX  operating  system.  The  primary  result  of  this  study  is  that  Halstead’s  and 
McCabe’s  metrics  are  highly  correlated  while  the  information  flow  metric  appears  to  be  an  independent  measure 
of  complexity. 

[Henr81b] Abstract: Structured design methodologies provide a disciplined and organized guide to the construction
of software systems. However, while the methodology structures and documents the points at which
design  decisions  are  made,  it  does  not  provide  a  specific,  quantitative  basis  for  making  these  decisions.  Typi¬ 
cally,  the  designers’  only  guidelines  are  qualitative,  perhaps  even  vague,  principles  such  as  “functionality,”  “data 
transparency,”  or  “clarity.”  This  paper,  like  several  recent  publications,  defines  and  validates  a  set  of  software 
metrics  which  are  appropriate  for  evaluating  the  structure  of  large-scale  systems.  These  metrics  are  based  on  the 
measurement  of  information  flow  between  system  components.  Specific  metrics  are  defined  for  procedure  com¬ 
plexity,  module  complexity,  and  module  coupling.  The  validation,  using  the  source  code  for  the  UNIX  operating 
system,  shows  that  the  complexity  measures  are  strongly  correlated  with  the  occurrence  of  changes.  Further,  the 
metrics  for  procedures  and  modules  can  be  interpreted  to  reveal  various  types  of  structural  flaws  in  the  design 
and  implementation. 
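
The procedure-level information flow measure can be illustrated with a short Python sketch. The abstract does not state the formula; the form used here, length times the square of (fan-in times fan-out), is the one commonly cited for this metric and is an assumption in this context. The function name and the sample numbers are hypothetical.

```python
def information_flow_complexity(length: int, fan_in: int, fan_out: int) -> int:
    """Commonly cited information flow measure for one procedure.

    length  -- size of the procedure (for example, lines of code)
    fan_in  -- number of information flows into the procedure
    fan_out -- number of information flows out of the procedure
    """
    return length * (fan_in * fan_out) ** 2


# Hypothetical procedures: (name, length, fan_in, fan_out)
procedures = [("parse", 40, 2, 3), ("dispatch", 25, 6, 5), ("log", 10, 1, 1)]

# Rank procedures from most to least complex under this measure.
ranked = sorted(
    procedures,
    key=lambda p: information_flow_complexity(p[1], p[2], p[3]),
    reverse=True,
)
```

Because the flow term is squared, a short procedure with many connections (such as the hypothetical `dispatch`) outranks a longer but loosely connected one.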

[Henr85]  Abstract:  This  paper  describes  the  development  of  a  procedure  for  evaluating  software  engineering 
methodologies. In formulating this evaluation procedure, the first question addressed is: What constitutes a
methodology? Using this discussion as a basis, we then establish a linkage of objectives, principles, and attributes
that are intrinsic to an “ideal” methodology and which reflects an assessment structured by the needs, process,
and product sequence for system development. This linkage is based on universally accepted software
engineering  goals,  and  provides  a  comparative  scale  for  assessing  the  relative  “goodness”  of  a  given  methodology. 
The  final  section  of  this  paper  discusses  the  application  of  this  procedural  evaluation  approach  to  the  Software 
Cost  Reduction  (SCR)  methodology  currently  used  by  the  United  States  Navy.  This  example  reveals  the  inherent 
power  of  the  procedural  approach  in  evaluating  software  development  methodologies. 

[Henr88a] Abstract: Maintenance of software makes up a large fraction of the time and money spent in the
software  life  cycle.  By  reducing  the  need  for  maintenance  these  costs  can  also  be  reduced.  Predicting  where 
maintenance  is  likely  to  occur  can  help  to  reduce  maintenance  by  prevention.  This  paper  details  a  study  of  the 
use  of  software  quality  metrics  to  determine  high  complexity  components  in  a  software  system.  By  the  use  of  a 
history  of  maintenance  done  on  a  particular  system,  it  is  shown  that  a  predictor  equation  can  also  be  developed 
to  identify  components  which  needed  maintenance  activities.  This  same  equation  can  be  used  to  determine 
which  components  are  likely  to  need  maintenance  in  the  future.  Through  the  use  of  these  predictions  and 
software  metric  complexities  it  should  be  possible  to  reduce  the  complexity  of  that  component  through  further 
decomposition.  Even  though  this  is  only  one  study,  this  methodology  of  developing  maintenance  predictors 
could  be  applied  in  any  environment. 

[Henr88b]  Abstract:  In  this  paper  we  describe  our  initial  work  on  a  long-term  project  to  develop  and  validate  a 
reliability  model  and  a  new  class  of  software  complexity  metrics  which  are  related  to  this  model.  In  contrast  to 
previous  “black  box”  approaches,  the  reliability  model  is  novel  because  it  incorporates  knowledge  about  the  sys¬ 
tem  in  the  form  of  quantitative  software  complexity  metrics.  While  the  initial  model  uses  existing  software 
metrics, a parallel effort in this project is investigating new classes of metrics, interface and dynamic metrics,
which are useful in their own right but are also of particular relevance to the reliability model. The initial definitions
of both the model and the metrics are given along with a description of the next research milestones.

[HenrXX]  Abstract:  For  many  years  the  software  engineering  community  has  been  attacking  the  software  relia¬ 
bility  problem  on  two  fronts.  First  via  design  methodologies,  languages  and  tools  as  a  precheck  on  quality  and 
second  by  measuring  the  quality  of  produced  software  as  a  postcheck.  This  research  attempts  to  unify  the 
approach  to  creating  reliable  software  by  providing  the  ability  to  measure  the  quality  of  a  design  prior  to  its 
implementation.  A  comparison  of  a  graphical  and  a  textual  design  language  is  presented  in  an  effort  to  support 
research  findings  that  the  human  brain  works  more  effectively  in  images  than  in  text. 

[Hetz84]  Abbreviated  Preface:  The  quality  of  systems  developed  and  maintained  in  most  organizations  is  poorly 
understood  and  below  standard.  This  book  explains  how  software  can  be  tested  effectively  and  how  to  manage 
that  effort  within  a  project  or  organization.  It  demonstrates  that  good  testing  practices  are  the  key  to  controlling 
and  improving  software  quality  and  explains  how  to  develop  and  implement  a  balanced  testing  program  to 
achieve  significant  quality  improvements. 

The  book  covers  the  discipline  of  software  testing:  what  testing  means,  how  to  define  it,  how  to  measure 
it,  and  how  to  ensure  its  effectiveness.  The  term  software  testing  is  used  broadly  to  include  the  full  scope  of  what 
is  sometimes  referred  to  as  test  and  evaluation  or  verification  and  validation  activities.  Software  testing  is  viewed 
as  the  continuous  task  of  planning,  designing,  and  constructing  tests,  and  of  using  those  tests  to  assess  and  evalu¬ 
ate  the  quality  of  work  performed  at  each  step  of  the  system  development  process.  Both  the  why  and  how  are 
considered.  The  why  addresses  the  underlying  principles,  where  the  concepts  came  from,  and  why  it  is  impor¬ 
tant.  The  how  is  practical  and  explains  the  method  and  management  practices  so  that  they  may  be  easily  under¬ 
stood  and  put  into  use. 

[Hibb82]  Abstract:  This  paper  reports  on  the  status  of  a  research  project  to  develop  compiler  techniques  to 
optimize  programs  for  execution  on  an  asynchronous  multiprocessor.  We  adopt  a  simplified  model  of  a  multipro¬ 
cessor,  consisting  of  several  identical  processors,  all  sharing  access  to  a  common  memory.  Synchronization  must 
be  done  explicitly,  using  two  special  operations  that  take  a  period  of  time  comparable  to  the  cost  of  data  opera¬ 
tions.  Our  treatment  differs  from  other  attempts  to  generate  code  for  such  machines  because  we  treat  the  neces¬ 
sary  synchronization  overhead  as  an  integral  part  of  the  cost  of  a  parallel  code  sequence.  We  are  particularly 
interested  in  heuristics  that  can  be  used  to  generate  good  code  sequences,  and  local  optimizations  that  can  then 
be  applied  to  improve  them.  Our  current  efforts  are  concentrated  on  generating  straight-line  code  for  high-level, 
algebraic  languages. 

We  compare  the  code  generated  by  two  heuristics,  and  observe  how  local  optimization  schemes  can  gradu¬ 
ally  improve  its  quality.  We  are  implementing  our  techniques  in  an  experimental  compiler  that  will  generate  code 
for Cm*, a real multiprocessor, having several characteristics of our model computer.

[Hill83] Abstract: This note describes RED, a remotely executed debugger capable of generating a real-time
source  level  trace  history  of  a  high  level  language  program  executing  on  a  microprocessor.  The  trace  history  con¬ 
sists  of  a  display  of  the  source  statements  of  each  basic  block  executed,  annotated  by  the  time  at  which  execution 
of  that  block  began.  Basic  blocks  are  traced  rather  than  statements  to  reduce  sampling  bandwidth  requirements 
while  still  retaining  the  ability  to  record  the  essential  logical  flow  of  programs.  RED  is  intended  to  assist  in  debug¬ 
ging  stand-alone  high  level  language  process  control  programs  with  real-time  constraints. 

We  outline  two  possible  implementation  schemes  for  generating  the  real-time  trace  history.  In  both,  a 
“debugging  co-processor”  collects  in  a  history  buffer  the  values  of  the  program  counter  (PC)  and  the  correspond¬ 
ing  value  of  a  clock  as  each  basic  block  begins  execution.  The  debugger,  which  runs  on  the  processor  hosting  the 
compiler  and  has  access  to  the  co-processor  over  a  fast  link,  reconstructs  a  source  level  trace  from  the  PC-time 
pairs  in  the  history  buffer.  In  one  scheme,  the  language  compiler  emits  an  extra  instruction  at  the  beginning  of 
each  basic  block  in  the  program  to  output  the  value  of  the  program  counter  to  a  parallel  port  connected  to  the 
debug  processor.  The  second  method  makes  use  of  an  extended  target  memory  space  to  provide  tag  bits  denoting 
basic  blocks.  When  an  instruction  is  fetched,  the  debug  processor  detects  the  presence  of  the  tag  bits  and  buffers 
up  the  value  of  the  corresponding  program  counter  and  time.  The  first  method  is  simpler  to  implement,  requiring 
only  conventional,  usually  straightforward  hardware  additions  to  the  target,  but  requires  the  execution  overhead 
of the extra instructions. In both cases the debugger itself runs on the host processor and has access to tables
generated  during  compile  time  of  the  source  program. 
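
The reconstruction step described in the RED abstract above can be illustrated with a minimal Python sketch. It assumes a hypothetical compile-time table that maps the starting program counter of each basic block to its source statements. The table layout, the names `BLOCK_TABLE` and `reconstruct_trace`, and the sample addresses are illustrative assumptions, not details taken from the note.

```python
from typing import Dict, List, Tuple

# Hypothetical compile-time table: starting PC of a basic block -> source lines.
BLOCK_TABLE: Dict[int, List[str]] = {
    0x100: ["x := read_sensor();"],
    0x120: ["if x > limit then", "    alarm();"],
    0x140: ["write_output(x);"],
}


def reconstruct_trace(history: List[Tuple[int, int]]) -> List[str]:
    """Turn (pc, clock) pairs from the history buffer into annotated source lines."""
    trace: List[str] = []
    for pc, clock in history:
        lines = BLOCK_TABLE.get(pc)
        if lines is None:
            # PC does not start a known basic block; report it rather than guess.
            trace.append(f"t={clock}: <unknown block at {pc:#x}>")
            continue
        for line in lines:
            trace.append(f"t={clock}: {line}")
    return trace
```

Each entry in the result pairs the time at which a block began executing with the source statements of that block, which mirrors the display the abstract describes.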

[Hite88]  Abstract:  Testing  programs  with  tractable  algorithms  is  one  area  in  which  software  engineers  have 
made  numerous  advances  over  the  past  few  decades.  Testing  rule-based  expert  systems,  however,  is  a  new  area 
in  software  engineering  which  requires  new  techniques. 

For  the  most  part,  traditional  software  engineering  testing  strategies  assume  modular  program  develop¬ 
ment.  This  assumption  is  impractical  to  make  for  expert  system  development,  for  the  knowledge  base  of  an 
expert  system  is  quite  simply  a  huge  non-modular  program.  It  consists  almost  entirely  of  non-ordered,  multi¬ 
branching  decision  statements.  In  traditional  programming,  the  module  interfaces  are  limited  and  well  defined. 
For  rule-based  expert  systems,  the  interaction  among  rules  is  combinatoric  and  highly  data-driven.  Thus  the  test¬ 
ing  of  a  completed  expert  system  via  traditional  path  analysis  is  impractical. 

The  design  of  a  testing  strategy  for  expert  systems  focuses  on  the  generic  phases  of  expert  system  develop¬ 
ment.  Briefly,  these  phases  include  system  definition,  incremental  system  implementation,  and  system  mainte¬ 
nance.  Using  the  simplified  breakdown  of  the  expert  system  development  process  as  a  guide,  certain  testing  tech¬ 
niques  can  be  generalized  enough  to  work  for  any  expert  system  application. 

[Hoar69]  Abstract:  In  this  paper  an  attempt  is  made  to  explore  the  logical  foundations  of  computer  program¬ 
ming  by  use  of  techniques  which  were  first  applied  in  the  study  of  geometry  and  have  later  been  extended  to  other 
branches  of  mathematics.  This  involves  the  elucidation  of  sets  of  axioms  and  rules  of  inference  which  can  be 
used in proofs of the properties of computer programs. Examples are given of such axioms and rules, and a formal
proof of a simple theorem is displayed. Finally, it is argued that important advantages, both theoretical and
practical,  may  follow  from  pursuance  of  these  topics. 
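
As an illustration of the kind of axiom the abstract refers to, the axiom of assignment is commonly written as shown below. The notation (braces around assertions, and P[f/x] for P with f substituted for x) is the later textbook convention and is an assumption here; the abstract itself does not reproduce any rule.

```latex
\[
  \{\, P[f/x] \,\}\;\; x := f \;\;\{\, P \,\}
\]
```

Read: if P holds with f substituted for x before the assignment, then P holds of x afterwards.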

[Hoar71b]  Abstract:  A  proof  is  given  of  the  correctness  of  the  algorithm  “Find.”  First,  an  informal  description 
is  given  of  the  purpose  of  the  program  and  the  method  used.  A  systematic  technique  is  described  for  construct¬ 
ing  the  program  proof  during  the  process  of  coding  it,  in  such  a  way  as  to  prevent  the  intrusion  of  logical  errors. 
The  proof  of  termination  is  treated  as  a  separate  exercise.  Finally,  some  conclusions  relating  to  general  program¬ 
ming  methodology  are  drawn. 

[Hoar72]  Introduction:  In  the  development  of  programs  by  stepwise  refinement,  the  programmer  is  encouraged 
to  postpone  the  decision  on  the  representation  of  his  data  until  after  he  has  designed  his  algorithm,  and  has 
expressed  it  as  an  “abstract”  program  operating  on  “abstract”  data.  He  then  chooses  for  the  abstract  data  some 
convenient and efficient concrete representation in the store of a computer; and finally programs the primitive
operations  required  by  his  abstract  program  in  terms  of  this  concrete  representation.  This  paper  suggests  an 
automatic  method  of  accomplishing  the  transition  between  an  abstract  and  a  concrete  program,  and  also  a 
method  of  proving  its  correctness;  that  is,  of  proving  that  the  concrete  representation  exhibits  all  the  properties 
expected of it by the “abstract” program. A similar suggestion [has been] made more formally in algebraic
terms,  which  gives  a  general  definition  of  simulation.  However,  a  more  restricted  definition  may  prove  to  be 
more  useful  in  practical  program  proofs. 

If  the  data  representation  is  proved  correct,  the  correctness  of  the  final  concrete  program  depends  only  on 
the  correctness  of  the  original  abstract  program.  Since  abstract  programs  are  usually  very  much  shorter  and 
easier  to  prove  correct,  the  total  task  of  proof  has  been  considerably  lightened  by  factorising  it  in  this  way.  Furth¬ 
ermore,  the  two  parts  of  the  proof  correspond  to  the  successive  stages  in  program  development,  thereby  contri¬ 
buting  to  a  constructive  approach  to  the  correctness  of  programs.  Finally,  it  must  be  recalled  that  in  the  case  of 
larger  and  more  complex  programs  the  description  given  above  in  terms  of  two  stages  readily  generalizes  to  mul¬ 
tiple  stages. 

[Hoar74] Abstract: This paper develops Brinch-Hansen’s concept of a monitor as a method of structuring an
operating  system.  It  introduces  a  form  of  synchronization,  describes  a  possible  method  of  implementation  in 
terms  of  semaphores  and  gives  a  suitable  proof  rule.  Illustrative  examples  include  a  single  resource  scheduler,  a 
bounded  buffer,  an  alarm  clock,  a  buffer  pool,  a  disk  head  optimizer,  and  a  version  of  the  problem  of  readers  and 
writers. 
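
The bounded buffer named among the illustrative examples can be sketched in Python as a monitor-style class. This is a minimal sketch under stated assumptions, not the paper's notation: the class name `BoundedBuffer` and its method names are illustrative, and Python's `threading.Condition` lets the signalling thread continue after `notify`, so each wait is re-checked in a `while` loop.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Monitor-style bounded buffer: one lock, two condition variables."""

    def __init__(self, capacity: int) -> None:
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        # Both conditions share the monitor lock.
        self._nonempty = threading.Condition(self._lock)
        self._nonfull = threading.Condition(self._lock)

    def append(self, item) -> None:
        with self._lock:
            while len(self._items) >= self._capacity:
                self._nonfull.wait()      # releases the lock while waiting
            self._items.append(item)
            self._nonempty.notify()       # wake one waiting consumer, if any

    def remove(self):
        with self._lock:
            while not self._items:
                self._nonempty.wait()
            item = self._items.popleft()
            self._nonfull.notify()        # wake one waiting producer, if any
            return item
```

All access to the shared queue happens while the single lock is held, which is the mutual exclusion property a monitor is meant to guarantee.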

[Hoar75]  Abbreviated  Abstract:  This  paper  distinguishes  a  number  of  ways  of  using  parallelism,  including  dis¬ 
joint  processes,  competition,  cooperation,  and  communication.  In  each  case  an  axiomatic  proof  rule  is  given. 

[Hoar78] Abstract: This paper suggests that input and output are basic primitives of programming and that
parallel  composition  of  communicating  sequential  processes  is  a  fundamental  program  structuring  method. 
When  combined  with  a  development  of  Dijkstra’s  guarded  command,  these  concepts  are  surprisingly  versatile. 
Their  use  is  illustrated  by  sample  solutions  of  a  variety  of  familiar  programming  exercises. 

[Hoar81] Abstract: A process communicates with its environment and with other processes by synchronized
output  and  input  on  named  channels.  The  current  state  of  a  process  is  defined  by  the  sequences  of  messages 
which  have  passed  along  each  of  the  channels,  and  by  the  sets  of  messages  that  may  next  be  passed  on  each  chan¬ 
nel.  A  process  satisfies  an  assertion  if  the  assertion  is  at  all  times  true  of  all  possible  states  of  the  process.  The 
author  presents  a  calculus  for  proving  that  a  process  satisfies  the  assertion  describing  its  intended  behaviour.  The 
following  constructs  are  axiomatised:  output;  input;  simple  recursion;  disjoint  parallelism;  channel  renaming, 
connection  and  hiding;  process  chaining;  nondeterminism;  conditional;  alternation;  and  mutual  recursion.  The 
calculus  is  illustrated  by  proof  of  a  number  of  simple  buffering  protocols. 

[Hoar87]  Introduction:  The  code  of  a  computer  program  is  a  formal  text,  describing  precisely  the  actions  of  a 
computer  executing  that  program.  As  in  other  branches  of  engineering,  the  progress  of  its  implementation  as 
well  as  its  eventual  quality  can  be  promoted  by  additional  design  documents,  formalized  before  starting  to  write 
the  final  code.  These  preliminary  documents  may  be  expressed  in  a  variety  of  notations  suitable  for  different  pur¬ 
poses  at  different  stages  of  a  project,  from  capture  of  requirements  through  design  and  implementation,  to 
delivery  and  long-term  maintenance.  These  notations  are  derived  from  mathematics,  and  include  algebra,  logic, 
functions,  and  procedures.  The  connection  between  the  notations  is  provided  by  mathematical  calculation  and 
proof. 

This  article  introduces  and  illustrates  a  selection  of  formal  methods  by  means  of  a  single  recurring  exam¬ 
ple,  the  design  of  a  program  to  compute  the  greatest  common  divisor  of  two  positive  numbers.  It  is  hoped  that 
some  of  the  conclusions  drawn  from  analysis  of  this  simple  example  will  apply  with  even  greater  force  to  software 
engineering  projects  on  a  more  realistic  scale. 
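
For readers unfamiliar with the recurring example, a greatest common divisor program can be sketched in Python as follows. The article's own derivation is not reproduced here; Euclid's algorithm is used as a stand-in chosen for illustration, and the precondition and loop invariant appear as an assertion and a comment in the spirit of the formal methods the article surveys.

```python
def gcd(x: int, y: int) -> int:
    """Greatest common divisor of two positive integers (Euclid's algorithm)."""
    assert x > 0 and y > 0, "precondition: both arguments are positive"
    a, b = x, y
    while b != 0:
        # Invariant: the common divisors of (a, b) are those of (x, y).
        a, b = b, a % b
    # Postcondition: a divides both x and y, and no larger integer does.
    return a
```

For example, `gcd(12, 18)` evaluates to 6.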

[Hodg76] Abbreviated Introduction: The production of consistently executable and dependable software
demands  a  thoughtful  systematic  implementation  -  with  clear  documentation  at  each  production  stage.  Recog¬ 
nizing  this,  the  Data  Systems  Laboratory,  at  Marshall  Space  Flight  Center,  NASA,  began  a  research  effort  to 
help  discover  and  institute  sound  engineering  principles  into  a  methodology  for  the  production  of  software. 

Achieving  this  end  demanded,  among  other  things,  the  development  of  a  formal  specifications  language 
that  could  traceably  embody  requirements,  a  high  level  programming  language  that  could  be  generated  easily  and 
faithfully  from  specifications  and  could  promote  a  logical  error-free  code  implementation,  a  language  preproces¬ 
sor  to  allow  compatibility  of  the  methodology  with  existing  compilers  and  finally,  automatic  code  analysis  tools  to 
attain our original objective: reducing software test and verification effort.

Such  a  methodology,  an  integrated  Software  Specification  and  Evaluation  System  (SSES),  is  being 
developed  for  NASA/MSFC.  [This  paper  presents]  the  technical  highlights  and  unification  of  the  system. 

[Holz82]  Abstract:  This  paper  introduces  a  simple  algebra  for  the  validation  of  communication  protocols  in  mes¬ 
sage  passing  systems.  The  behavior  of  each  process  participating  in  a  communication  is  first  modeled  in  a  finite 
state  machine.  The  symbol  sequences  that  can  be  accepted  by  these  machines  are  then  expressed  in  “protocol 
expressions,”  which  are  defined  as  regular  expressions  extended  with  two  new  operators:  division  and  multipli¬ 
cation.  The  interactions  of  the  machines  can  be  analyzed  by  combining  protocol  expressions  via  multiplication 
and  algebraically  manipulating  the  terms. 

The  method  allows  for  an  arbitrary  number  of  processes  to  participate  in  an  interaction.  In  many  cases  an 
analysis can be performed manually; in other cases the analysis can be automated. The method has been applied
to  a  number  of  realistic  protocols  with  up  to  seven  interacting  processes. 

An  automated  analyzer  was  written  in  the  language  C.  The  execution  time  of  the  automated  analysis  is  in 
most  cases  limited  to  a  few  minutes  of  CPU  time  on  a  PDP 11/70  computer. 

[Hous77] Abstract: A tool for the systematic production of test cases for a compiler is first presented. The inputs
of the generator are formal grammars, derived from the definition of the reference language. This tool has been
applied  to  the  generation  of  test  programs  for  Algol  68.  For  each  construction  which  the  language  possesses,  the 
syntactic  structure  of  the  corresponding  test  and  the  semantic  verifications  it  contains  are  given.  The  test  set  has 
begun  to  be  employed  on  a  specific  implementation.  Discovered  errors  related  to  Algol  68  constructions  are 
analyzed. 

[Howd75a]  Abstract:  A  methodology  for  generating  program  test  data  is  described.  The  methodology  is  a  model 
of  the  test  data  generation  process  and  can  be  used  to  characterize  the  basic  problems  of  test  data  generation.  It 
is  well  defined  and  can  be  used  to  build  an  automatic  test  data  generation  system. 

The  methodology  decomposes  a  program  into  a  finite  set  of  classes  of  paths  in  such  a  way  that  an  intui¬ 
tively  complete  set  of  test  cases  would  cause  the  execution  of  one  path  in  each  class.  The  test  data  generation 
problem  is  theoretically  unsolvable:  there  is  no  algorithm  which,  given  any  class  of  paths,  will  either  generate  a 
test  case  that  causes  some  path  in  that  class  to  be  followed  or  determine  that  no  such  data  exist.  The  methodol¬ 
ogy  attempts  to  generate  test  data  for  as  many  of  the  classes  of  paths  as  possible.  It  operates  by  constructing 
descriptions  of  the  input  data  subsets  which  cause  the  classes  of  paths  to  be  followed.  It  transforms  these 
descriptions  into  systems  of  predicates  which  it  attempts  to  solve. 

[Howd76c]  Abstract:  A  set  of  test  data  T  for  a  program  P  is  reliable  if  it  reveals  that  P  contains  an  error  when¬ 
ever  P  is  incorrect.  If  a  set  of  tests  T  is  reliable  and  P  produces  correct  output  for  each  element  of  T  then  P  is  a 
correct  program.  Test  data  generation  strategies  are  procedures  for  generating  sets  of  test  data.  A  testing  strategy 
is  reliable  for  a  program  P  if  it  produces  a  reliable  set  of  test  data  for  P.  It  is  proved  that  an  effective  testing  stra¬ 
tegy  which  is  reliable  for  all  programs  cannot  be  constructed.  A  description  of  the  path  analysis  testing  strategy  is 
presented.  In  the  path  analysis  strategy  data  are  generated  which  cause  different  paths  in  a  program  to  be  exe¬ 
cuted.  A  method  for  analyzing  the  reliability  of  path  testing  is  introduced.  The  method  is  used  to  characterize 
certain  classes  of  programs  and  program  errors  for  which  the  path  analysis  strategy  is  reliable.  Examples  of  pub¬ 
lished  incorrect  programs  are  included. 
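
The definition of a reliable test set can be made concrete with a small Python sketch. It assumes a finite input domain and an executable specification so that correctness can be checked exhaustively; the paper's definition carries neither restriction. The function names and the deliberately faulty example program are hypothetical.

```python
def reveals_error(program, spec, tests) -> bool:
    """True if some test input makes the program disagree with the specification."""
    return any(program(t) != spec(t) for t in tests)


def is_reliable(program, spec, tests, domain) -> bool:
    """T is reliable for P if T reveals an error whenever P is incorrect.

    Correctness is decided by exhaustive comparison over a finite domain,
    which is an assumption made only for this illustration.
    """
    incorrect = any(program(d) != spec(d) for d in domain)
    return (not incorrect) or reveals_error(program, spec, tests)


def faulty_abs(x: int) -> int:
    """Hypothetical faulty absolute value: wrong only for the input -1."""
    if x == -1:
        return -1
    return x if x >= 0 else -x


domain = range(-5, 6)
weak_tests = [-2, 0, 2]    # misses the failing input, so not reliable
strong_tests = [-1, 0, 1]  # includes the failing input, so reliable
```

Here `is_reliable(faulty_abs, abs, weak_tests, domain)` is false and `is_reliable(faulty_abs, abs, strong_tests, domain)` is true. The abstract's impossibility result concerns a strategy that achieves this for all programs, which no finite check of this kind can provide.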

[Howd76e] Abstract: Symbolic evaluation techniques can be used to determine the cumulative effects of a program’s
calculations on the branching predicates and output variables in the program. If the evaluation techniques
are carefully and selectively applied, they can be used to generate revealing symbolic representations of the computations
carried out by the paths in a program, and of the systems of predicates that describe the input data that
causes  program  paths  to  be  executed.  A  symbolic  evaluation  system  called  DISSECT  is  described  which  can  be 
used  to  analyze  FORTRAN  programs.  The  system  includes  a  sophisticated  command  language  that  allows  the 
user  to  selectively  apply  symbolic  evaluation  techniques  to  different  program  paths  and  subpaths.  The  command 
language  allows  the  user  to  carry  out  different  levels  of  symbolic  testing  of  a  program  and  to  construct  systems  of 
predicates  that  can  be  used  to  automate  the  generation  of  numeric  test  data.  Experiments  with  the  system  which 
illustrate  its  advantages  and  limitations  are  included.  DISSECT  can  be  used  to  carry  out  a  systematic,  docu¬ 
mented  reliability  analysis  of  a  program.  The  paper  concludes  with  a  discussion  of  the  potential  use  of  systems 
like  DISSECT  as  the  basic  software  certification  tool  in  the  software  development  process. 
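
Illustrative sketch (not part of the cited abstract): the idea of symbolically evaluating a path can be shown in a few lines of Python. The sketch below walks one program path and produces the symbolic output expressions and the system of branch predicates over the input names. It is not DISSECT and does not handle FORTRAN; the path encoding and the names `substitute` and `symbolic_path` are assumptions made for this example.

```python
import re

def substitute(expr, env):
    """Replace every program variable in expr by its current symbolic value."""
    def repl(match):
        name = match.group()
        return f"({env[name]})" if name in env else name
    return re.sub(r"[A-Za-z_]\w*", repl, expr)

def symbolic_path(path, inputs):
    """Symbolically evaluate one path.

    path   -- list of ("assign", var, expr) or ("branch", condition) steps
    inputs -- names of the input variables; each one stands for itself
    Returns (final symbolic values, list of path predicates).
    """
    env = {name: name for name in inputs}
    predicates = []
    for step in path:
        if step[0] == "assign":
            _, var, expr = step
            env[var] = substitute(expr, env)
        elif step[0] == "branch":
            predicates.append(substitute(step[1], env))
        else:
            raise ValueError(f"unknown step kind: {step[0]}")
    return env, predicates

# Hypothetical path: y = x * 2; branch (y > 10) taken; y = y - 3
values, predicates = symbolic_path(
    [("assign", "y", "x * 2"), ("branch", "y > 10"), ("assign", "y", "y - 3")],
    inputs=["x"],
)
```

Any input that satisfies every entry of `predicates` drives execution down this path, which is how such systems of predicates can support the generation of numeric test data.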

[Howd77a]  Abstract:  The  report  is  divided  into  two  parts.  The  first  part  contains  a  study  of  the  design  of  sym¬ 
bolic  evaluation  systems.  It  also  contains  an  estimate  of  the  costs  of  using  such  a  system  to  carry  out  symbolic 
program  testing.  The  second  part  contains  a  study  of  the  effectiveness  of  symbolic  testing.  It  contains  an  analysis 
of  the  circumstances  under  which  symbolic  testing  is  reliable  for  discovering  program  bugs.  The  effectiveness  of 
symbolic  testing  is  compared  with  other  reliability  analysis  techniques.  The  analysis  of  the  effectiveness  of  sym¬ 
bolic  testing  which  is  contained  in  Part  2  is  based  on  the  study  of  six  programs.  Descriptions  of  the  programs  and 
the details of the analyses are contained in the six Appendices.

[Howd77b]  Abstract:  Symbolic  testing  and  a  symbolic  evaluation  system  called  DISSECT  are  described.  The 
principal features of DISSECT are outlined. The results of two classes of experiments in the use of symbolic
evaluation  are  summarized.  Several  classes  of  program  errors  are  defined  and  the  reliability  of  symbolic  testing  in 
finding bugs is related to the classes of errors. The relationship of symbolic evaluation systems like DISSECT to
classes  of  program  errors  and  to  other  kinds  of  program  testing  and  program  analysis  tools  is  also  discussed. 
Desirable  improvements  in  DISSECT,  whose  importance  was  revealed  by  the  experiments,  are  mentioned. 

[Howd77c]  Summary:  The  effectiveness  in  discovering  errors  of  symbolic  evaluation  and  of  testing  and  static 
program  analysis  are  studied.  The  three  techniques  are  applied  to  a  diverse  collection  of  programs  and  the  results 
compared.  Symbolic  evaluation  is  used  to  carry  out  symbolic  testing  and  to  generate  symbolic  systems  of  path 
predicates.  The  use  of  the  predicates  for  automated  test  data  selection  is  analyzed.  Several  conventional  types  of 
program  testing  strategies  are  evaluated.  The  strategies  include  branch  testing,  structured  testing  and  testing  on 
input  values  having  special  properties.  The  static  source  analysis  techniques  that  are  studied  include  anomaly 
analysis  and  interface  analysis. 

Examples  are  included  which  describe  typical  situations  in  which  one  technique  is  reliable  but  another 
unreliable.  The  effectiveness  of  symbolic  testing  is  compared  with  testing  on  actual  data  and  with  the  use  of  an 
integrated  methodology  that  includes  both  testing  and  static  source  analysis.  Situations  in  which  symbolic  testing 
is  difficult  to  apply  or  not  effective  are  discussed.  Different  ways  in  which  symbolic  evaluation  can  be  used  for 
generating  test  data  are  described.  Those  ways  for  which  it  is  most  effective  are  isolated.  The  paper  concludes 
with  a  discussion  of  the  most  effective  uses  to  which  symbolic  evaluation  can  be  put  in  an  integrated  system 
which  contains  all  three  of  the  validation  techniques  that  are  studied. 

[Howd78a]  Abstract:  Two  approaches  to  the  study  of  program  testing  are  described.  One  approach  is  theoretical 
and  the  other  empirical.  In  the  theoretical  approach  situations  are  characterized  in  which  it  is  possible  to  use 
testing  to  formally  prove  the  correctness  of  programs  or  the  correctness  of  properties  of  programs.  In  the  empiri¬ 
cal  approach  statistics  are  collected  which  record  the  frequency  with  which  different  testing  strategies  reveal  the 
errors in a collection of programs. A summary of the results of two research projects which investigated these
approaches is presented. The differences between the two approaches are discussed and their relative
advantages and disadvantages are compared.

[Howd78b] Summary: An approach to the study of program testing is introduced in which program testing is
treated as a special kind of equivalence problem. In this approach, classes of programs P* and associated classes
of test sets T* are defined which have the property that if two programs P and Q in P* agree on a set of tests from
T*,  then  P  and  Q  are  computationally  equivalent.  The  properties  of  a  class  P*  and  the  associated  class  T*  can  be 
thought  of  as  defining  a  set  of  assumptions  about  a  hypothetical  correct  version  Q  of  a  program  P  in  P*.  If  the 
assumptions  are  valid  then  it  is  possible  to  prove  the  correctness  of  P  by  testing.  The  main  result  of  the  paper  is 
an  equivalence  theorem  for  classes  of  programs  which  carry  out  sequences  of  computations  involving  the  ele¬ 
ments  of  arrays. 

[Howd78c] Abstract: The use of traces in proving properties of programs is investigated. Two kinds of traces are
studied.  The  first,  called  “value  traces”  contain  intermediate  values  of  program  variables.  A  theorem  is 
presented  which  can  be  used  to  verify  the  computations  which  generate  the  values  appearing  in  a  value  trace.  The 
second  kind  of  trace,  called  a  “symbolic  trace”  contains  the  unevaluated  sequence  of  assignment  statements  and 
branch  predicates  that  occur  along  a  program  path.  A  special  class  of  symbolic  traces  called  “elementary  traces” 
is  defined.  A  theorem  is  presented  which  proves  that  if  the  elementary  symbolic  traces  of  a  program  are  correct 
then  all  of  its  symbolic  traces  are  correct.  The  correctness  of  the  set  of  symbolic  traces  for  a  program  implies  the 
correctness  of  the  program. 

[Howd78d]  Abstract:  This  short  paper  summarizes  the  different  approaches  to  a  theory  of  program  testing.  The 
goals  of  a  theory  of  testing  and  several  of  the  results  which  have  been  achieved  are  described. 

[Howd78f]  Abstract:  In  recent  years  a  number  of  research  projects  have  been  completed  which  have  attempted 
to  assess  the  effectiveness  of  different  software  validation  methods.  The  results  of  those  projects  are  summarized 
and  the  effectiveness  of  the  different  methods  compared. 

[Howd80a]  Abstract:  An  approach  to  functional  testing  is  described  in  which  the  design  of  a  program  is  used  to 
generate  functional  test  data.  The  approach  depends  on  the  use  of  design  methods  that  model  the  abstract  func¬ 
tional  structure  of  a  program  as  well  as  the  abstract  structure  of  the  data  on  which  the  program  operates.  An 
example  of  the  use  of  the  method  is  given  and  a  discussion  of  its  effectiveness. 

[Howd80b]  Abstract:  Error  analysis  involves  the  examination  of  a  collection  of  programs  whose  errors  are 
known.  Each  error  is  analyzed  and  validation  techniques  which  would  discover  the  error  are  identified.  The 
errors  that  were  present  in  version  five  of  a  package  of  Fortran  scientific  subroutines  and  then  later  corrected  in 
version  six  were  analyzed.  An  integrated  collection  of  static  and  dynamic  analysis  methods  would  have 
discovered  the  errors  in  version  five  before  its  release.  An  integrated  approach  to  validation  and  the  effective¬ 
ness  of  individual  methods  are  discussed. 

[Howd80c]  Abstract:  An  approach  to  functional  testing  is  described  in  which  the  design  of  a  program  is  viewed 
as  an  integrated  collection  of  functions.  The  selection  of  test  data  depends  on  the  functions  used  in  the  design 
and  on  the  value  spaces  over  which  the  functions  are  defined.  The  basic  ideas  in  the  method  were  developed  dur¬ 
ing  the  study  of  a  collection  of  scientific  programs  containing  errors.  The  method  was  the  most  reliable  testing 
technique  for  discovering  the  errors.  It  was  found  to  be  significantly  more  reliable  than  structural  testing.  The 
two  techniques  are  compared  and  their  relative  advantages  and  limitations  are  discussed. 

[Howd80d]  Abstract:  Program  testing  metrics  are  based  on  criteria  for  measuring  the  completeness  of  a  set  of 
program  tests.  Branch  testing  measures  the  percentage  of  program  branches  that  are  traversed  during  a  set  of 
tests.  Mutation  testing  measures  the  ability  of  a  set  of  tests  to  distinguish  a  program  from  similar  programs.  A  cri¬ 
terion  for  test  completeness  is  introduced  in  this  paper  which  measures  the  ability  of  a  set  of  tests  to  distinguish 
between  functions  which  are  implemented  by  parts  of  programs.  The  criterion  is  applied  to  functions  which  are 
implemented  by  different  kinds  of  programming  language  statements.  It  is  more  effective  than  branch  testing  and 
incorporates  some  of  the  advantages  of  mutation  testing.  Its  effectiveness  can  be  discussed  formally  and  it  can  be 
described as part of an integrated approach to testing. A tool can be used to implement the method.

[Howd81b] Abstract: The term “static analysis” has traditionally been used to refer to program analysis methods
that  assist  the  user  in  verifying  his  program,  but  which  do  not  require  its  execution.  Static  analysis  includes  tech¬ 
niques  which  produce  general  information  about  a  program,  such  as  cross-reference  tables,  as  well  as  techniques 
which  search  for  particular  types  of  errors,  such  as  uninitialized  variables.  This  survey  describes  both  traditional 
static  analysis  methods  as  well  as  other  validation  methods  that  do  not  require  program  execution.  It  includes 
techniques  that  involve  the  analysis  of  system  documents  other  than  the  program  code,  such  as  requirements  and 
design  analysis.  It  also  includes  code  analysis  techniques  such  as  symbolic  evaluation. 


[Howd81c]  Abstract:  A  scheme  for  classifying  program  testing  methods  is  introduced.  Program  testing  methods 
are  classified  according  to  whether  they  involve  the  generation  of  test  data  which  is  based  on  the  requirements 
specifications,  the  design  specifications  or  the  source  code  for  a  program.  Detailed  descriptions  of  functional 
requirements  and  functional  design  based  testing  are  included.  Source  code  methods  which  are  described 
include  branch  testing,  path  testing,  file  testing  and  expression  testing.  Other  dynamic  analysis  techniques,  such 
as  dynamic  assertions  and  recovery  control  blocks  are  also  described. 

[Howd82a] Abstract: Different approaches to the generation of test data are described. Error-based approaches
depend on the definition of classes of commonly occurring program errors. They generate tests which are specifically
designed to determine if particular classes of errors occur in a program. An error-based method called weak
mutation testing is described. In this method, tests are constructed which are guaranteed to force program statements
which contain certain classes of errors to act incorrectly during the execution of the program over those
tests. The method is systematic, and a tool can be built to help the user apply the method. It is extensible in the
sense  that  it  can  be  extended  to  cover  additional  classes  of  errors.  Its  relationship  to  other  software  testing 
methods  is  discussed.  Examples  are  included. 

Different  approaches  to  testing  involve  different  concepts  of  the  adequacy  or  completeness  of  a  set  of 
tests.  A  formalism  for  characterizing  the  completeness  of  test  sets  that  are  generated  by  error-based  methods 
such  as  weak  mutation  testing  as  well  as  the  test  sets  generated  by  other  testing  methods  is  introduced.  Error- 
based,  functional,  and  structural  testing  emphasize  different  approaches  to  the  test  data  generation  problem.  The 
formalism  which  is  introduced  in  the  paper  can  be  used  to  describe  their  common  basis  and  their  differences. 
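
Illustrative sketch (not part of the cited abstract): the component-level check behind weak mutation testing can be shown for a single class of errors, a wrong relational operator. The Python sketch below reports which alternative operators a set of tests fails to distinguish from the one actually written, judged only by the value of the comparison itself. The operator table and the name `surviving_mutants` are assumptions made for this example, not the paper's formalism.

```python
import operator

RELATIONAL_OPERATORS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}

def surviving_mutants(op, operand_pairs):
    """Return the relational operators that no test distinguishes from `op`.

    operand_pairs -- the (left, right) values reaching the comparison
                     during each test execution.
    """
    original = RELATIONAL_OPERATORS[op]
    survivors = []
    for name, mutant in RELATIONAL_OPERATORS.items():
        if name == op:
            continue
        # A mutant is killed as soon as one test makes the two
        # comparisons evaluate differently at this statement.
        if all(original(a, b) == mutant(a, b) for a, b in operand_pairs):
            survivors.append(name)
    return survivors

weak_tests = surviving_mutants("<", [(1, 2)])
strong_tests = surviving_mutants("<", [(1, 2), (2, 2), (3, 2)])
```

With the single test (1, 2) the operators `<=` and `!=` survive; adding the boundary case (2, 2) and the case (3, 2) leaves no survivors.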


[Howd82b]  Introduction:  The  software  life  cycle  can  be  divided  into  requirements,  design,  programming,  and 
maintenance.  Validation  has  also  been  considered  a  phase  of  the  life  cycle  and  is  sometimes  inserted  between 
programming  and  maintenance.  Recent  experience,  however,  indicates  that  validation  should  be  integrated  into 
all  phases  rather  than  isolated  in  a  separate  stage  that  takes  place  long  after  requirements  and  design  have  been 
completed.  Studies  show  that  the  later  validation  is  carried  out,  the  more  expensive  it  becomes  to  find  errors 
made  early  in  the  development  process. 

In  the  integrated  approach  described  in  this  article,  validation  is  a  part  of  each  phase  of  the  life  cycle.  Two 
validation activities, analysis and test data generation, take place during each phase. The programming and
maintenance  phases  also  include  actual  examination  of  program  tests.  Analysis  involves  the  direct  examination 
of  specification  and  code  for  errors  or  erroneous  properties.  Test  data  generation  involves  the  construction  of 
test  sets  that  are  based  on  the  important  functional  properties  of  specification  and  code. 


[Howd85]  Introduction:  Program  testing  consists  of  a  scattered  collection  of  rules  of  thumb,  coverage  measures 
and  testing  philosophies.  Several  attempts  have  been  made  to  construct  theories  to  explain  why  testing  works  and 
to  isolate  classes  of  faults  that  can  be  consistently  remedied  by  certain  methods. 

In  one  approach  to  testing,  called  functional  testing,  a  collection  of  methods  are  integrated  that  can  be 
described  from  the  same  point  of  view  and  whose  effectiveness  can  be  analyzed  using  a  common  theory. 

The  functional  testing  methods  described  here  are  suitable  for  module  and  integration  testing  at  the 
development  stage.  The  focus  is  on  functional  faults  -  errors  caused  by  a  program  that  computes  the  wrong  func¬ 
tion  -  rather  than  timing  or  performance  problems.  Most  of  the  ideas  are  not  original,  but  show  how  it  all  fits 
together. The discussion is informal and no attempt is made to present the theory in formal definitions and
theorems. 

[Howd86]  Abstract:  An  integrated  approach  to  testing  is  described  which  includes  both  static  and  dynamic 
analysis  methods  and  which  is  based  on  theoretical  results  that  prove  both  its  effectiveness  and  efficiency.  Pro¬ 
grams  are  viewed  as  consisting  of  collections  of  functions  that  are  joined  together  using  elementary  functional 
forms  or  complex  functional  structures. 

Functional  testing  is  identified  as  the  input-output  analysis  of  functional  forms.  Classes  of  faults  are 
defined  for  these  forms  and  results  presented  which  prove  the  fault  revealing  effectiveness  of  well  defined  sets  of 
tests. 

Functional  analysis  is  identified  as  the  analysis  of  the  sequences  of  operators,  functions,  and  data  type 
transformations  which  occur  in  functional  structures.  Functional  trace  analysis  involves  the  examination  of  the 
sequences  of  function  calls  which  occur  in  a  program  path;  operator  sequence  analysis  the  examination  of  the 
sequences  of  operators  on  variables,  data  structures,  and  devices;  and  data  type  transformation  analysis  the 
examination  of  the  sequences  of  transformations  on  data  types.  Theoretical  results  are  presented  which  prove 
that  it  is  only  necessary  to  look  at  interfaces  between  pairs  of  operators  and  data  type  transformations  in  order  to 
detect  the  presence  of  operator  or  data  type  sequencing  errors.  The  results  depend  on  the  definition  of  normal 
forms  for  operator  and  data  type  sequencing  diagrams. 

[Howd87]  Abbreviated  Preface:  This  book  presents  an  integrated  approach  to  program  testing  and  analysis 
which  has  a  sound  mathematical  basis.  It  describes  both  previous  techniques,  and  how  they  fit  together,  as  well  as 
new  methods.  It  provides  a  general  approach  to  testing  and  validation  that  incorporates  all  important  software 
life  cycle  products,  including  requirements  and  general  and  detailed  designs.  The  results  can  be  used  to  prove 
that well-defined classes of faults and failures will be discovered by specific techniques. Functional testing and
analysis  is  a  general  approach  to  verification  and  validation  and  not  only  integrates  current  techniques,  but  indi¬ 
cates  fruitful  directions  for  continued  research  and  development. 

[Howd89a]  Overview:  This  paper  presents  a  brief  overview  of  the  author’s  recent  and  current  work.  The  main 
topics address the development of a general model of how software is constructed and of the reasoning errors
that  humans  make  during  this  process,  and  flavor  analysis. 

[Howe84]  Abstract:  In  the  areas  of  software  development,  data  processing  management  often  focuses  more  on 
coding  techniques  and  system  architecture  than  on  how  to  manage  the  development.  In  recent  years,  “structured 
programming”  and  “structured  analysis”  have  received  more  attention  than  the  techniques  software  managers 
employ  to  manage.  Moreover,  these  coding  and  architecture  considerations  are  often  advanced  as  the  key  to  a 
smooth  running,  well  managed  project. 

This  paper  documents  a  philosophy  for  software  development  and  the  tools  used  to  support  it.  Those 
management techniques deal with quantifying such abstract terms as “productivity,” “performance,” and “progress,”
and with measuring these quantities and applying management controls to maximize them. The paper
also  documents  the  applications  of  these  techniques  on  a  major  software  development  effort. 

[Hsie89]  Abstract:  An  approach  to  timing  analysis  of  cyclic  concurrent  programs  is  presented.  GR0  path- 
expressions  are  used  to  describe  synchronization  and  concurrency  of  atomic  operations  in  cyclic  concurrent  pro¬ 
grams.  The  behavior  of  a  cyclic  concurrent  program  is  represented  as  a  partial  order  of  atomic  operations,  and  a 
technique to derive this partial order from a GR0 program is developed. Given the execution times of the individual
atomic operations of a GR0 program and a set of timing constraints, our timing analysis technique uses the
partial  order  to  determine  whether  the  concurrent  program,  when  executed,  will  satisfy  the  set  of  timing  con¬ 
straints.  The  timing  analysis  technique  can  be  completely  automated. 
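
Illustrative sketch (not part of the cited abstract): once a partial order of atomic operations and their execution times are known, a simple timing check reduces to a longest-path computation. The Python sketch below covers only that final step for a single cycle of execution. It does not model GR0 path expressions, and it assumes every operation starts as soon as its predecessors finish. The operation names, the deadline format, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def finish_times(durations, predecessors):
    """Earliest finish time of each operation under the partial order.

    durations    -- {operation: execution time}
    predecessors -- {operation: operations that must finish first}
    """
    graph = {op: tuple(predecessors.get(op, ())) for op in durations}
    finish = {}
    for op in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[op]), default=0)
        finish[op] = start + durations[op]
    return finish

def meets_deadlines(durations, predecessors, deadlines):
    """True if every constrained operation finishes by its deadline."""
    finish = finish_times(durations, predecessors)
    return all(finish[op] <= limit for op, limit in deadlines.items())

# Hypothetical operations and times
durations = {"read": 2, "filter": 3, "log": 1, "write": 2}
predecessors = {"filter": ["read"], "log": ["read"], "write": ["filter", "log"]}
```

Here `write` cannot finish before time 7, so a deadline of 8 is satisfied and a deadline of 6 is not.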

[Huan75]  Abstract:  One  of  the  practical  methods  commonly  used  to  detect  the  presence  of  errors  in  a  computer 
program  is  to  test  it  for  a  set  of  test  cases.  The  probability  of  discovering  errors  through  testing  can  be  increased 
by selecting test cases in such a way that each and every branch in the flowchart will be traversed at least once
during  the  test.  This  tutorial  describes  the  problems  involved  and  the  methods  that  can  be  used  to  satisfy  the  test 
requirement. 

[Huan78]  Introduction:  It  is  well  known  that  one  cannot  find  all  the  errors  in  a  program  simply  by  testing  it  for  a 
set  of  input  data.  Nevertheless,  program  testing  is  the  most  commonly  used  technique  for  error  detection  in 
today’s  software  industry.  Consequently,  the  problem  of  finding  a  program  test  method  with  increased  error- 
detection  capability  has  received  considerable  attention  in  the  field  of  software  research. 

There  appear  to  be  two  major  approaches  to  this  problem.  One  is  to  find  better  criteria  for  test-case  selec¬ 
tion.  The  other  is  to  find  a  way  to  obtain  additional  information  (i.e.,  information  other  than  that  provided  by  the 
output  of  the  program)  that  can  be  used  to  detect  errors. 

The  technique  of  program  instrumentation  discussed  in  this  article  can  be  regarded  as  a  major  outgrowth 
of  the  second  approach.  The  main  idea  is  to  insert  additional  statements  (instruments)  into  the  program  to  be 
tested  for  the  purpose  of  computing  certain  program  attributes.  By  testing  (executing)  the  instrumented  program 
for  a  properly  chosen  set  of  test  cases,  we  will  be  able  to  obtain  the  values  of  the  program  attributes  automati¬ 
cally.  The  attribute  values  provide  us  with  additional  information  for  error  detection.  The  following  pages  illus¬ 
trate  the  utility  of  this  technique  and  explore  its  potential  as  a  tool  for  program  validation. 
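
Illustrative sketch (not part of the cited introduction): program instrumentation can be shown by inserting probe statements into a small function and reading off one program attribute, the fraction of branches exercised by a set of test cases. The function `classify`, the probe labels, and the helper names in this Python sketch are hypothetical.

```python
from collections import Counter

hits = Counter()

def probe(label):
    """Instrument: record that the labelled branch was taken."""
    hits[label] += 1

def classify(n):
    """Program under test, with probe calls inserted on every branch."""
    if n < 0:
        probe("neg:true")
        return "negative"
    probe("neg:false")
    if n == 0:
        probe("zero:true")
        return "zero"
    probe("zero:false")
    return "positive"

ALL_BRANCHES = {"neg:true", "neg:false", "zero:true", "zero:false"}

def branch_coverage(test_inputs):
    """Run the instrumented program and return the fraction of branches hit."""
    hits.clear()
    for n in test_inputs:
        classify(n)
    return len(ALL_BRANCHES & set(hits)) / len(ALL_BRANCHES)
```

The single test case 5 exercises half of the branches, while the test set -1, 0, 5 exercises all of them; the probes leave the output of `classify` unchanged.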

[Huan79]  Abstract:  A  data  flow  anomaly  in  a  program  is  an  indication  that  a  programming  error  might  have 
been  committed.  This  paper  describes  a  method  for  detecting  such  an  anomaly  by  means  of  program  instrumen¬ 
tation.  The  method  is  conceptually  simple,  easy  to  use,  easy  to  implement  on  a  computer,  and  can  be  applied  in 
conjunction  with  a  conventional  program  test  to  achieve  increased  error-detection  capability. 
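
Illustrative sketch (not part of the cited abstract): one common way to detect data flow anomalies at run time is to track a small state machine per variable, driven by the define, reference, and undefine actions that instrumented statements report. The Python sketch below uses the states U (undefined), D (defined but not yet referenced), and R (referenced); the state names, the event encoding, and the recovery rule after an anomaly are assumptions made for this example.

```python
# (state, action) -> next state; "A" marks an anomalous transition.
TRANSITIONS = {
    ("U", "d"): "D", ("U", "r"): "A", ("U", "u"): "U",
    ("D", "d"): "A", ("D", "r"): "R", ("D", "u"): "A",
    ("R", "d"): "D", ("R", "r"): "R", ("R", "u"): "U",
}
STATE_AFTER = {"d": "D", "r": "R", "u": "U"}

def find_anomalies(events):
    """Scan (variable, action) events reported by instrumented statements.

    Actions are "d" (define), "r" (reference), "u" (undefine).
    Returns a list of (event index, variable, prior state, action).
    """
    state = {}
    anomalies = []
    for index, (var, action) in enumerate(events):
        current = state.get(var, "U")
        nxt = TRANSITIONS[(current, action)]
        if nxt == "A":
            anomalies.append((index, var, current, action))
            # Resume tracking from the state this action normally produces.
            nxt = STATE_AFTER[action]
        state[var] = nxt
    return anomalies

# x is defined twice without an intervening reference; y is referenced undefined.
report = find_anomalies([("x", "d"), ("x", "d"), ("x", "r"), ("y", "r")])
```

An anomaly reported this way is only an indication that an error might have been committed, so each entry still needs to be inspected by the programmer.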

[Hump88]  Abbreviated  Introduction:  One  SEI  project  is  to  provide  the  Defense  Department  with  some  way  to 
characterize  the  capabilities  of  software-development  organizations.  The  result  is  this  software-process  maturity 
framework, which can be used by any software organization to assess its own capabilities and identify the most
important  areas  for  improvement. 

This  software-development  process-maturity  model  reasonably  represents  the  actual  ways  in  which 
software-development  organizations  improve.  It  provides  a  framework  for  assessing  these  organizations  and 
identifying  the  priority  areas  for  immediate  improvement.  It  also  helps  identify  those  places  where  advanced 
technology  can  be  most  valuable  in  improving  the  software-development  process. 

The  SEI  is  using  this  model  as  a  foundation  for  a  continuing  program  of  assessments  and  software  process 
development. These assessment methods have been made public, and preliminary data is now available.

[Hutc83]  Abstract:  This  paper  examines  the  use  of  cluster  analysis  as  a  tool  for  system  modularization.  Several 
clustering  techniques  are  discussed  and  used  on  two  medium-size  systems  and  a  group  of  small  projects.  The 
small  projects  are  presented  because  they  provide  examples  (that  will  fit  into  a  paper)  of  certain  types  of 
phenomena.  Data  bindings  between  the  routines  of  the  system  provide  the  basis  for  the  bindings.  It  appears  that 
the  clustering  of  data  bindings  provides  a  meaningful  view  of  system  modularization. 

[IEEE83a] Foreword: Software engineering is an emerging field. New terms are continually being generated, and
new  meanings  are  being  adopted  for  existing  terms.  The  Glossary  of  Software  Engineering  Terminology  was 
undertaken  to  document  this  vocabulary.  Its  purpose  is  to  identify  terms  currently  used  in  software  engineering 
and to present the current meanings of these terms. It is intended to serve as a useful reference for software
engineers  and  for  those  in  related  fields  and  to  promote  clarity  and  consistency  in  the  vocabulary  of  software 
engineering.  It  is  recognized  that  software  engineering  is  a  dynamic  area;  thus  the  standard  will  be  subject  to 
appropriate  change  as  becomes  necessary. 

[IEEE83b]  Purpose:  The  purpose  of  this  standard  is  to  describe  a  set  of  basic  software  test  documents.  A  stand¬ 
ardized  test  document  can  facilitate  communication  by  providing  a  common  frame  of  reference  (for  example,  a 
customer  and  a  supplier  have  the  same  definition  for  a  test  plan).  The  content  definition  of  a  standardized  test 
document can serve as a completeness checklist for the associated testing process. A standardized set can also
provide  a  baseline  for  the  evaluation  of  current  test  documentation  practices.  In  many  organizations,  the  use  of 
these  documents  significantly  increases  the  manageability  of  testing.  Increased  manageability  results  from  the 
greatly  increased  visibility  of  each  phase  of  the  testing  process. 

This  standard  specifies  the  form  and  content  of  individual  test  documents.  It  does  not  specify  the  required 
set  of  test  documents.  It  is  assumed  that  the  required  set  of  test  documents  will  be  specified  when  the  standard  is 
applied.  Appendix  B  contains  an  example  of  such  a  set  specification. 

[IEEE83c]  Scope:  This  standard  provides  minimum  requirements  for  preparation  and  content  of  Software  Con¬ 
figuration  Management  (SCM)  Plans.  SCM  Plans  document  the  methods  to  be  used  for  identifying  software  pro¬ 
duct  items,  controlling  and  implementing  changes,  and  recording  and  reporting  change  implementation  status. 

This  standard  applies  to  the  entire  life  cycle  of  critical  software;  for  example,  where  failure  could  impact 
safety  or  cause  large  financial  or  social  losses.  For  noncritical  software,  or  for  software  already  developed,  a  sub¬ 
set  of  the  requirements  may  be  applied. 

This  standard  identifies  those  essential  items  that  shall  appear  in  all  Software  Configuration  Management 
Plans.  In  addition  to  those  items,  the  users  of  this  standard  are  encouraged  to  incorporate  additional  items  into 
the  plan,  as  appropriate,  to  satisfy  unique  configuration  management  needs,  or  to  modify  the  contents  of  specific 
sections  to  fully  describe  the  scope  and  magnitude  of  the  software  configuration  management  effort.  Where  this 
standard  is  invoked  for  a  project  engaged  in  producing  several  software  items,  the  applicability  of  the  standard 
shall  be  specified  for  each  of  the  software  product  items  encompassed  by  the  project. 

Examples  are  incorporated  into  the  text  of  this  standard  to  enhance  clarity  and  to  promote  understanding. 
Examples  are  either  explicitly  identified  as  such,  or  can  be  recognized  by  the  use  of  the  verb  may.  Examples  shall 
not  be  construed  as  mandatory  implementations. 

[IEEE84]  The  purpose  of  this  standard  is  to  provide  uniform,  minimum  acceptable  requirements  for  preparation 
and  content  of  Software  Quality  Assurance  Plans  (SQAP). 

In  considering  adoption  of  this  standard,  regulatory  bodies  should  be  aware  that  specific  application  of 
this  standard  may  already  be  covered  by  one  or  more  IEEE  or  ANSI  standards  documents  relating  to  quality 
assurance,  definitions,  or  other  matters.  It  is  not  the  purpose  of  IEEE  Std  730  to  supersede,  revise  or  amend 
existing  standards  directed  to  specific  industries  or  applications. 

This  standard  applies  to  the  development  and  maintenance  of  critical  software;  for  example,  where  failure 
could  impact  safety  or  cause  large  financial  or  social  losses.  For  non-critical  software,  or  for  software  already 
developed,  a  subset  of  the  requirements  of  this  standard  may  be  applied. 

The  existence  of  this  standard  should  not  be  construed  to  prohibit  additional  content  in  a  Software  Qual¬ 
ity  Assurance  Plan.  An  assessment  should  be  made  for  the  specific  software  product  item  to  assure  adequacy  of 
coverage.  Where  this  standard  is  invoked  for  a  project  engaged  in  producing  several  software  items,  the  applica¬ 
bility  of  the  standard  should  be  specified  for  each  of  the  software  product  items  encompassed  by  the  project. 

[IEEE88] Scope: This standard provides a methodology for establishing quality requirements and identifying,
implementing,  analyzing  and  validating  software  quality  metrics.  This  methodology  applies  to  all  software  at  all 
phases  of  the  software  life  cycle.  As  a  standard,  this  methodology  is  mandatory,  though  not  exhaustive  in  its 
implementation  details. 

Software  quality  is  the  degree  to  which  software  possesses  a  desired  combination  of  attributes.  By  defini¬ 
tion,  software  quality  is  relative;  it  varies  from  system  to  system  as  requirements  vary.  Likewise,  the  set  of  appli¬ 
cable  metrics  used  to  measure  software  quality  varies  from  system  to  system.  For  this  reason,  this  standard  does 
not  prescribe  specific  metrics.  However,  the  appendices  include  examples  of  metrics  together  with  a  complete 
example  of  the  use  of  this  standard. 

A  software  quality  metric  is  a  function  whose  inputs  are  software  data  and  whose  output  is  a  single 
(numerical)  value  that  can  be  interpreted  as  the  degree  to  which  software  possesses  a  given  attribute  that  affects 
its  quality. 

[Iann84]  Abstract:  A  set  of  criteria  is  proposed  for  the  comparison  of  software  reliability  models.  The  intention
is  to  provide  a  logically  organized  basis  for  determining  the  superior  models  and  for  the  presentation  of  model 
characteristics.  It  is  hoped  that  in  the  future,  a  software  manager  will  be  able  to  easily  select  the  model  most  suit¬ 
able  for  his  requirements  from  among  the  preferred  ones. 

[Ibar82]  Abstract:  We  consider  a  simple  class  of  loop-free  programs  whose  instruction  repertoire  consists  of  x
<-  0,  x  <-  c,  x  <-  cx,  x  <-  x/c,  x  <-  x  +  y,  x  <-  x  -  y,  skip  l,  if  p(x,y)  then  skip  l,  and  halt.  (x  and  y  are  integer
variables,  c  is  a  positive  integer,  x/c  is  integer  division,  l  is  a  nonnegative  integer,  and  p(x,y)  is  a  predicate  of  the
form  x  >  y,  x  >=  y,  x  =  y,  x  !=  y,  x  <=  y,  or  x  <  y;  skip  l  causes  the  (l+1)st  instruction  following  the  current
instruction  to  be  executed  next.)  We  show  that  the  equivalence  problem  for  this  class  is  decidable  in  2^(λN^2)  time
(N  =  sum  of  the  sizes  of  the  programs  and  λ  is  a  fixed  positive  constant).  The  bound  cannot  be  reduced  to  a
polynomial  in  N  unless  P  =  NP.  In  fact,  we  have  the  following  rather  surprising  result:  The  equivalence  problem
for  programs  with
one  input  variable  (which  also  serves  as  the  output  variable)  and  one  auxiliary  variable  using  only  instructions  x 
<-  2x,  x  <-  x/2,  and  x  <-  x  +  y  is  NP-hard. 
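
The instruction set in the [Ibar82] abstract is small enough to interpret directly. The following is a minimal sketch, not the decision procedure of the paper: the tuple encoding, the names run and agree_on, the choice of floor division for negative operands, and the convention that unset variables start at 0 are all assumptions made for illustration. Comparing two programs on a finite range of inputs can only refute equivalence, never prove it.

```python
import operator
from collections import defaultdict

# Predicates allowed in "if p(x,y) then skip l".
PREDICATES = {">": operator.gt, ">=": operator.ge, "=": operator.eq,
              "!=": operator.ne, "<=": operator.le, "<": operator.lt}


def run(program, x):
    """Run a loop-free program on input x and return the final value of x.

    Instructions are tuples (hypothetical encoding):
      ("zero", v), ("const", v, c), ("mul", v, c), ("div", v, c),
      ("add", v, w), ("sub", v, w), ("skip", l),
      ("ifskip", pred, v, w, l), ("halt",)
    """
    env = defaultdict(int)          # assumption: auxiliary variables start at 0
    env["x"] = x
    pc = 0
    while pc < len(program):        # pc only moves forward, so this terminates
        op, *args = program[pc]
        if op == "halt":
            break
        if op == "zero":
            env[args[0]] = 0
        elif op == "const":
            env[args[0]] = args[1]
        elif op == "mul":
            env[args[0]] *= args[1]
        elif op == "div":
            env[args[0]] //= args[1]    # assumption: floor division
        elif op == "add":
            env[args[0]] += env[args[1]]
        elif op == "sub":
            env[args[0]] -= env[args[1]]
        elif op == "skip":
            pc += args[0]               # the (l+1)st following instruction runs next
        elif op == "ifskip":
            pred, v, w, l = args
            if PREDICATES[pred](env[v], env[w]):
                pc += l
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return env["x"]


def agree_on(p, q, values):
    """True if programs p and q produce the same output for every input in values."""
    return all(run(p, v) == run(q, v) for v in values)
```

For example, [("add", "x", "x")] and [("mul", "x", 2)] agree on every input tried, while a program that clamps negative inputs to zero does not agree with either.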

[Ingl86]  Abbreviated  Introduction:  Standard  measures  of  software  quality  have  been  set  up  for  AT&T  Bell 
Laboratories.  These  metrics  allow  a  software  project  to  be  followed  through  its  development,  controlled  intro¬ 
duction,  and  release  to  customers.  The  metrics  serve  both  project  and  corporate  management  needs.  For  project 
management,  they  allow  more  effective  management  of  development  effort,  and  they  help  ensure  a  fast  and  effective
solution  to  problems  that  arise  at  any  stage.  For  corporate  management,  they  provide  a  vehicle  for  quantifying
the  overall  quality  of  software  development,  for  setting  quality  improvement  objectives,  and  for  tracking
results.  In  particular,  the  metrics  provide  quantitative  information  on  the  number  of  faults,  normalized  so  that
corporate  results  can  be  summarized  and  projects  of  different  size  can  be  compared;  the  responsiveness  of  support
organizations  in  resolving  problems;  and  the  impact  of  fixes  on  customers.

[Isod87]  Introduction:  Debugging  programs  involves  repeating  several  steps:  executing  test  programs,  detecting 
errors,  investigating  their  causes,  correcting  them,  and  recompiling.  The  most  time-consuming,  difficult  step  is 
the  detection  and  investigation  of  errors. 

Several  researchers  have  tried  to  find  an  effective  way  to  represent  execution  behavior  for  program  debug¬ 
ging.  Cargill’s  Blit  debugger  uses  multiple  windows  that  simultaneously  show  program  execution  output,  interac¬ 
tion  with  the  debugger,  and  program  text.  Myers’  Incense  debugger  displays  the  variables’  data  structures,  but 
does  not  address  the  dynamic  features  of  program  execution.  Reiss’  Pecan  debugger  does  propose  dynamic 
presentation  of  the  control  flow  on  a  program  flowchart,  but  it  displays  character-based  data  just  like  traditional 
debugging  tools.  The  Balsa  and  program  visualization  systems  can  represent  data  in  figures  that  are  appropriate 
to  their  meaning.  However,  they  do  not  have  facilities  to  represent  control  flow  of  blocks  or  dataflow  among  vari¬ 
ables. 

Therefore,  these  systems  are  insufficient  for  debugging  programs  because  they  represent  only  a  part  of 
program  execution  behavior. 

We  have  developed  a  visual  debugger  for  Ada  programs,  the  Visual  and  Interactive  Programming  Support. 
VIPS  uses  graphics  to  show  the  static  and  dynamic  behavior  of  program  execution.  It  greatly  reduces  the  time 
and  effort  in  program  debugging  because  it  helps  a  programmer  detect  and  localize  program  errors  more  easily
than  with  traditional  tools. 

[Itak82]  Abstract:  Estimation  of  the  size  and  time  required  for  software  development  is  probably  the  most  diffi¬ 
cult  aspect  of  any  project.  Up  to  now,  most  estimates  have  been  done  subjectively  by  experts.  These  estimates 
are  often  inaccurate.  In  the  midst  of  development,  faulty  estimates  may  contribute  to  delays  and/or  excess 
expenses. 

In  the  last  several  years,  several  estimation  models  have  been  proposed,  most  of  which  were  models  to 
estimate  software  development  cost  (manpower).  These  models  used  program  size  as  a  variable.  However  at  the 
beginning  of  development,  when  estimations  are  made,  program  sizes  are  usually  uncertain  and  costs  (man¬ 
power)  are  equally  uncertain. 

The  authors  developed  a  program-size  estimation  model  for  batch  programs  in  a  banking  system,  and  used
the  model  in  an  actual  project.  Using  the  adapted  model,  estimation  errors  amounted  to  only  7  percent.  This  is 
much  better  than  the  accuracy  of  estimations  made  by  experts  in  the  field  (usually  about  10  percent  accuracy), 
and  indicates  that  objective  estimation  methods  can  be  derived  for  program-size. 

In  this  paper,  we  introduce  our  estimation  model  and  discuss  the  adaptation  of  that  model  for  a  specific 
project. 

[Ives83]  Abstract:  This  paper  critically  reviews  measures  of  user  information  satisfaction  and  selects  one  for 
replication  and  extension.  A  survey  of  production  managers  is  used  to  provide  additional  support  for  the  instru¬ 
ment,  eliminate  scales  that  are  psychometrically  unsound,  and  develop  a  standard  short  form  for  use  when  only 
an  overall  assessment  of  information  satisfaction  is  required  and  survey  time  is  limited. 

[Jach84]  Abstract:  The  occurrence  of  a  data  flow  anomaly  is  often  an  indication  of  the  existence  of  a  program¬ 
ming  error.  The  detection  of  such  anomalies  can  be  used  for  detecting  errors  and  to  upgrade  software  quality. 
This  paper  introduces  a  new,  efficient  algorithm  capable  of  detecting  anomalous  data  flow  patterns  in  a  program 
represented  by  a  graph.  The  algorithm  based  on  static  analysis  scans  the  paths  entering  and  leaving  each  node  of 
the  graph  to  reveal  anomalous  data  action  combinations.  An  algorithm  implementing  this  type  of  approach  was 
proposed  by  Fosdick  and  Osterweil  [Fosd76a].  Our  approach  presents  a  general  framework  which  not  only  fills  a 
gap  in  the  previous  algorithm,  but  also  provides  time  and  space  improvements. 
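
To make the notion of an anomalous data action combination concrete, here is a minimal sketch that enumerates the paths of a small acyclic flow graph and scans each variable's action sequence for the classic patterns ur (referenced while undefined), dd (defined twice with no intervening reference), and du (defined, then undefined without use). This is not the algorithm of [Jach84] or of [Fosd76a]; path enumeration is exponential in general and is used here only because it is easy to read. The graph encoding and all names are assumptions.

```python
# Anomalous (previous action, current action) pairs for one variable.
# Actions: "d" = define, "r" = reference, "u" = undefine.
ANOMALIES = {("u", "r"): "ur", ("d", "d"): "dd", ("d", "u"): "du"}


def paths(graph, node, end):
    """Yield every path from node to end in an acyclic graph."""
    if node == end:
        yield [node]
        return
    for nxt in graph.get(node, []):
        for rest in paths(graph, nxt, end):
            yield [node] + rest


def find_anomalies(graph, actions, start, end, variables):
    """Return a set of (variable, anomaly kind, node where it is detected).

    actions maps node -> {variable: string of actions in execution order}.
    """
    found = set()
    for path in paths(graph, start, end):
        for var in variables:
            state = "u"                 # every variable starts out undefined
            for node in path:
                for act in actions.get(node, {}).get(var, ""):
                    kind = ANOMALIES.get((state, act))
                    if kind:
                        found.add((var, kind, node))
                    state = act
    return found
```

On a diamond-shaped graph where node 1 defines x, node 2 redefines it, and node 4 references both x and a never-defined y, the sketch reports a dd anomaly for x at node 2 and a ur anomaly for y at node 4.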

[Jack71]  Introduction:  In  April  1966  work  was  initiated  by  Martin  Marietta  Corporation  (MMC),  Denver  Divi¬ 
sion,  to  extend  the  role  of  an  airborne  computer  to  include  flight  controls  as  well  as  guidance  and  navigation 
computations.  This  project  is  one  of  several  significant  improvements  for  the  Titan  IIIC  space  booster  which
was  funded  by  the  Space  and  Missile  Systems  Organization  of  the  Air  Force.  The  new  Digital  Flight  Control  System
(DFCS)  has  been  successfully  tested  in  four  (4)  Titan  IIIC  missions.

The  purpose  of  this  paper  is  to  describe  how  a  large  hybrid  computer  simulation  was  used  as  an  aid  to 
design  and  develop  the  DFCS  and  then  used  to  validate  the  resulting  DFCS  airborne  software. 

The  simulation  was  programmed  in  six  degrees-of-freedom  and  included  an  airborne  Univac  1824M  Mis¬ 
sile  Guidance  Computer  (MGC)  in  the  closed  loop.  Additional  computing  equipment  used  in  the  simulation 
included  three  (3)  EAI  8800  analog  computers,  an  EAI  8400  digital  computer  and  an  SDS  930  digital  computer. 
Flight  control  hardware  components  such  as  rate  gyros,  body  mounted  accelerometers,  and  hydraulic  actuators 
were  also  used  in  the  simulation. 

[Jaha86]  Abstract:  Two  important  characteristics  of  time-critical  systems  are:  the  requirement  to  satisfy 
stringent  timing  constraints,  and  the  need  to  guard  against  an  imperfect  execution  environment.  In  this  paper,  we 
formalize  the  safety  analysis  of  timing  properties  in  real-time  systems.  Our  analysis  is  based  on  a  formal  logic: 
RTL  (Real-Time  Logic)  which  is  especially  suitable  for  reasoning  about  the  timing  behavior  of  systems.  Given 
the  formal  specification  of  a  system  and  a  safety  assertion  to  be  analyzed,  our  goal  is  to  relate  the  safety  assertion 
to  the  systems  specification.  There  are  three  distinct  cases:  1)  the  safety  assertion  is  a  theorem  derivable  from 
the  systems  specification,  2)  the  safety  assertion  is  unsatisfiable  with  respect  to  the  systems  specification,  or  3) 
the  negation  of  the  safety  assertion  is  satisfiable  under  certain  conditions.  A  systematic  method  for  performing 
safety  analysis  will  be  presented. 

[Jalo89]  Abstract:  Specifications  are  means  to  define  formally  the  behavior  of  a  system  or  a  system  component. 
Completeness  is  a  desirable  property  for  specifications.  In  this  paper,  we  describe  a  system  that  tests  for  the 
completeness  of  axiomatic  specifications  of  abstract  data  types.  For  testing,  the  system  generates  a  set  of  test 
cases  and  an  implementation  of  the  data  type  from  the  specifications.  The  generated  implementation  is  such  that 
if  the  specifications  are  not  complete,  the  implementation  is  not  complete,  and  the  behavior  of  all  of  the 
sequences  of  valid  operations  on  the  data  type  is  not  defined.  This  implementation  is  tested  with  the  generated 
test  cases  to  detect  the  incompleteness  of  specifications.  The  system  is  implemented  on  a  VAX  system  running 
Unix. 

[Jard83]  Abstract:  An  approach  to  testing  the  consistency  of  specifications  is  explored,  which  is  applicable  to
the  design  validations  of  communication  protocols  and  other  cases  of  step-wise  refinement.  In  this  approach,  a
testing  module  compares  a  trace  of  interactions  obtained  from  an  execution  of  the  refined  specification  (e.g.  the 
protocol  specification)  with  the  reference  specification  (e.g.  the  communication  service  specification). 

Non-determinism  in  reference  specifications  presents  certain  problems.  Using  an  extended  finite  state 
transition  model  for  the  specifications,  a  strategy  for  limiting  the  amount  of  non-determinacy  is  presented. 

An  automated  method  for  constructing  a  testing  module  for  a  given  reference  specification  is  discussed. 
Experience  with  the  application  of  this  testing  approach  to  the  design  of  a  Transport  protocol  and  a  distributed 
mutual  exclusion  algorithm  is  described. 
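
The idea of checking a trace against a non-deterministic reference specification can be sketched by tracking the set of reference states the trace could be in after each interaction. This is a generic subset-tracking check over a plain finite state machine, offered as an illustration only; it does not reproduce the extended finite state transition model or the strategy for limiting non-determinacy described in [Jard83]. The state names, event names, and the function trace_conforms are hypothetical.

```python
def trace_conforms(transitions, initial, trace):
    """Check whether a trace of interactions is allowed by a reference FSM.

    transitions maps (state, event) -> set of possible next states, so a
    non-deterministic specification may list several successors.
    """
    current = {initial}                     # all states the spec could be in
    for event in trace:
        current = {nxt
                   for state in current
                   for nxt in transitions.get((state, event), ())}
        if not current:                     # no reference behaviour explains the trace
            return False
    return True
```

With a reference specification in which a connection request may lead either to a waiting state or to a refused state, both the trace that ends in a confirmation and the trace that ends in a disconnect indication are accepted, while a data transfer immediately after the request is rejected.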

[Jeff85]  Abstract:  This  paper  reviews  a  study  on  programming  productivity  carried  out  to  (i)  confirm  previous
published  results,  (ii)  explore  the  impact  of  the  changing  programming  environment  on  productivity,  and  (iii) 
examine  the  influence  on  productivity  of  organizational  factors.  Programming  productivity  data  were  collected 
from  17  organizations  and  analyzed  in  the  light  of  data  collected  some  4  years  earlier.  While  significant  technolog¬ 
ical  changes  were  observed  to  have  occurred  in  the  programming  environment,  the  results  of  the  later  study  were 
in  many  respects  almost  identical  to  those  obtained  earlier,  thus  validating  the  previous  study.  The  organizational 
variables  collected  revealed  a  very  strong  relationship  between  a  programmer’s  attitude  to  his  supervisor  and 
programming  productivity. 

[Jeli72]  Introduction  and  Summary:  A  software  reliability  study  was  initiated  by  Advanced  Information  Systems
subdivision  of  McDonnell  Douglas  Astronautics  Company,  Huntington  Beach,  California,  to  conduct  research 
into  the  nature  of  the  software  reliability  problem  including  definitions,  contributing  factors  and  means  for  con¬ 
trol. 

Discrepancy  reports  which  originated  during  the  development  of  two  large-scale  real-time  systems  form 
two  separate  primary  data  sources  for  the  reliability  study.  A  mathematical  model,  descriptively  entitled  the 
De-Eutrophication  Process,  was  developed  to  describe  the  time  pattern  of  the  occurrence  of  discrepancies 
(errors).  This  model  has  been  employed  to  estimate  the  initial  (or  residual)  error  content  in  a  software  package 
as  well  as  to  estimate  the  time  between  discrepancies  at  any  phase  of  its  development.  A  means  of  predicting  mission
success  on  the  basis  of  errors  which  occur  during  testing  is  described.

Problems  in  categorizing  software  anomalies  are  described  and  the  special  area  of  the  genesis  of 
discrepancies  during  the  integration  of  modules  is  discussed.  Management  techniques  which  should  reduce  the 
number  of  software  anomalies  are  described. 
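
The De-Eutrophication Process is better known in later literature as the Jelinski-Moranda model. As that model is usually stated, the failure rate before the i-th discrepancy is proportional to the number of errors still in the software: z_i = phi * (N - (i - 1)), where N is the initial error content and phi is a per-error constant. The sketch below simply evaluates that commonly cited form; the abstract itself gives no formula, and the function and parameter names are assumptions.

```python
def hazard_rate(n_initial, phi, i):
    """Failure rate while waiting for the i-th discrepancy (i starts at 1)."""
    remaining = n_initial - (i - 1)         # errors not yet removed
    if remaining <= 0:
        raise ValueError("no errors remain under the model")
    return phi * remaining


def expected_time_between(n_initial, phi, i):
    """Expected time from discrepancy i-1 to discrepancy i.

    Under the model this interval is exponentially distributed, so its
    mean is the reciprocal of the hazard rate.
    """
    return 1.0 / hazard_rate(n_initial, phi, i)


def residual_errors(n_initial, found_so_far):
    """Residual error content after a number of discrepancies are fixed."""
    return n_initial - found_so_far
```

With N = 10 and phi = 0.01, the expected wait for the first discrepancy is about 10 time units and the wait for the tenth is about 100, which shows how the time between discrepancies grows as errors are removed.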

[John75]  Abstract:  The  report  contains  plans  for  a  complete  software  reliability  measurement  program  using 
both  manual  and  automatic  data  entry.  The  program  is  to  be  run  in  conjunction  with  SAMTEC  at  Vandenberg
AFB  in  an  effort  to  establish  measurement  and  evaluation  criteria  for  the  advanced  systematic  techniques  for 
reliable  operational  software  (ASTROS)  project.  An  integral  part  of  that  project  is  the  implementation  and 
evaluation  of  structured  programming  techniques. 

Included  in  the  report  are  all  forms  necessary  to  describe  the  software  development  environment,  the 
hierarchy  and  size  of  programming  modules,  and  to  capture  any  significant  events  that  will  affect  programming 
and  test  while  they  are  in  progress.  Forms  and  instructions  for  their  use  for  manual  data  collection  are  included, 
as  are  descriptions  of  items  that  could  be  collected  automatically. 

[John79]  Summary:  Designers  and  implementors  of  high-level  language  translators  can,  with  relatively  little 
extra  effort,  greatly  facilitate  run-time  symbolic  debugging.  Practical  suggestions  are  presented,  based  on  experi¬ 
ences  gained  from  interfacing  several  compilers  with  a  run-time  debugging  system. 

[John82b]  Abstract:  This  glossary  contains  291  definitions  of  terms  dealing  with  the  debugging  of  computer
software.  The  list  includes  numerous  synonyms,  as  well  as  the  proper  names  of  debugging  systems  described  in 
the  open  literature.  Terms  and  definitions  have  been  obtained  from  various  sources:  the  software-engineering 
literature,  other  software-engineering  glossaries,  and  individual  contributions. 

[John83]  Abstract:  This  paper  deals  with  issues  that  have  emerged  as  a  result  of  a  successful  implementation  of  a
source  level  symbolic  debugger  for  HP-1000  computer  systems.  By  analyzing  a  user’s  thought  processes  during  a 
debugging  session  we  created  a  powerful  and  easy  to  use  tool  for  program  analysis. 

[John84]  Abstract:  This  paper  describes  a  program  called  PROUST  which  does  online  analysis  and  understand¬ 
ing  of  Pascal  programs  written  by  novice  programmers.  PROUST  takes  as  input  a  program  and  a  non-algorithmic 
description  of  the  program  requirements,  and  finds  the  most  likely  mapping  between  the  requirements  and  the 
code.  This  mapping  is  in  essence  a  reconstruction  of  the  design  and  implementation  steps  that  the  programmer 
went  through  in  writing  the  program.  A  knowledge  base  of  programming  plans  and  strategies,  together  with 
common  bugs  associated  with  them,  is  used  in  constructing  this  mapping.  Bugs  are  discovered  in  the  process  of 
relating  plans  to  the  code;  PROUST  can  therefore  give  deep  explanations  of  program  bugs  by  relating  the  buggy 
code  to  its  underlying  intentions. 

[Jone78]  Overview:  Discussed  is  the  unit-of-measure  situation  in  programming.  An  analysis  of  common  units  of 
measure  for  assessing  program  quality  and  programmer  productivity  reveals  that  some  standard  measures  are  int¬ 
rinsically  paradoxical.  Lines  of  code  per  programmer-month  and  cost  per  defect  are  in  this  category.  Presented 
here  are  attempts  to  go  beyond  such  paradoxical  units  as  these.  Also  discussed  is  the  usefulness  of  separating 
quality  measurements  into  measures  of  defect  removal  efficiency  and  defect  prevention,  and  the  usefulness  of 
separating  productivity  measurements  into  work  units  and  cost  units. 

[Jone81]  Abstract:  Programming  productivity  has  become  a  significant  topic  for  a  number  of  the  world’s  indus¬ 
trial,  commercial,  governmental,  and  university  communities.  The  decade  from  1970  to  1980  witnessed  an  unpre¬ 
cedented  growth  in  computers  and  programming,  that  was  accompanied  by  unprecedented  problems  with  costs, 
quality,  schedules,  and  low  productivity.  Current  research  indicates  that  the  greatest  barrier  to  improved  produc¬ 
tivity  lies  in  the  enormous  costs  which  are  associated  with  programming  defect  removal  and  with  paperwork. 
Therefore  the  most  direct  strategy  for  improving  productivity  is  to  concentrate  on  methods  that  simplify  com¬ 
plexity,  improve  requirements  and  design,  minimize  paperwork,  and  reduce  errors.  However,  attempts  to  move 
toward  these  goals  reveal  underlying  problems  whose  solutions  will  require  the  application  of  concepts  from  dis¬ 
ciplines  outside  of  programming,  such  as  linguistics  and  perceptual  psychology.  Programming  is  becoming  a 
catalyst  that  has  the  potential  of  forming  new  and  synergistic  combinations  of  ideas. 

By  the  mid  1980’s,  software  is  no  longer  an  “outcast”  technology  regarded  as  inferior  by  older  sciences. 
Accurate  metrics  of  software  projects  and  the  new  engineering  principles  supporting  reusable  designs  and  reus¬ 
able  code  are  giving  software  a  new  professionalism.  The  advent  of  new  human  interface  techniques  derived 
from  object-oriented  programming  methods  may  be  leading  to  a  new  plateau,  in  which  human  control  and  usage 
of  complex  devices  is  more  natural  and  intuitive  than  at  any  time  in  history. 

This  tutorial  volume  on  productivity  issues  for  the  eighties  attempts  to  place  programming  in  context  with 
other  disciplines,  and  addresses  five  major  topics:  (1)  Programming  measurements,  (2)  programming  life-cycle 
analysis,  (3)  programming  requirements  and  design  methods,  (4)  programming  environments,  and  (5)  the  new 
science  of  software. 

[Joyc87a]  Abstract:  A  computerized  therapeutic  radiation  machine  has  been  blamed  in  incidents  that  have  led  to
the  deaths  of  two  patients  and  serious  injuries  to  several  others.  The  deadly  medical  mystery  posed  by  the  machine
was  finally  traced  to  a  software  bug,  “Malfunction  54,”  named  after  the  message  displayed  on  the  operator  con¬ 
sole.  The  affair  is  seen  as  epitomizing  the  software  reliability  crisis  at  its  worst,  and  raises  the  thorny  legal  issue  of 
liability  for  personal  injuries  caused  by  defective  programs.  The  pending  lawsuits  over  the  malfunctioning 
machine  may  set  a  legal  precedent  that  could  affect  all  computer  users  and  vendors.  Ultimately,  such  cases  call 
into  question  our  increasing  dependence  on  computers  for  everything  from  banking  to  national  defense. 

[Joyc87b]  Abstract:  The  monitoring  of  distributed  systems  involves  the  collection,  interpretation,  and  display 
of  information  concerning  the  interactions  among  concurrently  executing  processes.  This  information  and  its 
display  can  support  the  debugging,  testing,  performance  evaluation,  and  dynamic  documentation  of  distributed 
systems.  General  problems  associated  with  monitoring  are  outlined  in  this  paper,  and  the  architecture  of  a  general
purpose,  extensible,  distributed  monitoring  system  is  presented.  Three  approaches  to  the  display  of  process
interactions  are  described:  textual  traces,  animated  graphical  traces,  and  a  combination  of  aspects  of  the  textual 
and  graphical  approaches.  The  roles  that  each  of  these  approaches  fulfill  in  monitoring  and  debugging  distributed
systems  are  identified  and  compared.  Monitoring  tools  for  collecting  communication  statistics,  detecting
deadlock,  controlling  the  non-deterministic  execution  of  distributed  systems,  and  for  using  protocol  specifica¬ 
tions  in  monitoring  are  also  described. 

Our  discussion  is  based  on  experience  in  the  development  and  use  of  a  monitoring  system  within  a  distri¬ 
buted  programming  environment  called  Jade.  Jade  was  developed  within  the  Computer  Science  Department  of 
the  University  of  Calgary  and  is  now  being  used  to  support  teaching  and  research  at  a  number  of  university  and 
research  organizations. 

[Kafu81]  Overview:  We  state  a  set  of  criteria  that  has  guided  the  development  of  a  metric  system  for  measuring
the  quality  of  a  large-scale  software  product.  This  metric  system  uses  the  flow  of  information  within  the  system  as 
an  index  of  system  interconnectivity.  Based  on  this  observed  interconnectivity,  a  variety  of  software  metrics  can 
be  defined.  The  types  of  software  quality  features  that  can  be  measured  by  this  approach  are  summarized.  The 
data-flow  analysis  techniques  used  to  establish  the  paths  of  information  flow  are  explained  and  illustrated. 
Finally,  a  means  of  integrating  various  metrics  and  models  into  a  comprehensive  software  development  environment
is  discussed.  This  possible  integration  is  explained  in  terms  of  the  Gandalf  system  currently  under  development
at  Carnegie-Mellon  University.

[Kafu85a]  Abstract:  In  this  paper  are  presented  the  results  of  a  study  in  which  several  production  software  systems
are  analyzed  using  ten  software  metrics.  The  ten  metrics  include  both  measures  of  code  details,  measures  of
structure,  and  combinations  of  the  two.  Historical  data  recording  the  number  of  errors  and  the  coding  time  of
each  component  are  used  as  objective  measures  of  resource  expenditure  of  each  component.  The  metrics  are 
validated  by  showing:  (1)  the  metrics  singly  and  in  combination  are  useful  indicators  of  those  components  which 
require  the  most  resources,  (2)  clear  patterns  between  the  metrics  and  the  resources  expended  are  visible  when 
both  resources  are  accounted  for,  (3)  measures  of  structure  are  as  valuable  in  examining  software  systems  as 
measures  of  code  details,  and  (4)  the  choice  of  which,  or  how  many,  software  metrics  to  employ  in  practice  is 
suggested  by  measures  of  “yield”  and  “coverage”. 

[Kafu85b]  Abstract:  This  paper  reports  on  an  effort  to  relate  seven  different  software  quality  metrics  to  the
experience  of  maintenance  activities  performed  on  a  medium  size  data  base  system.  Three  different  versions  of 
the  data  base  system  that  evolved  over  a  period  of  three  years  were  analyzed  in  this  study.  A  major  revision  of  the 
data  base  system,  while  still  in  its  design  phase,  was  also  analyzed. 

The  results  of  this  study  indicate:  (1)  that  the  growth  in  system  complexity  as  determined  by  the  software 
metrics  agrees  with  the  general  character  of  the  maintenance  tasks  performed  in  successive  versions;  (2)  the
metrics  were  able  to  identify  the  improper  integration  of  functional  enhancements  made  to  the  system;  (3)  the 
complexity  values  of  the  system  components  as  indicated  by  the  metrics,  conform  well  to  an  intuitive  understand¬ 
ing  of  the  system  by  people  familiar  with  the  system;  and  (4)  an  analysis  of  the  redesigned  version  of  the  data  base 
system  showed  the  usefulness  of  software  metrics  in  the  (re)design  phase  by  revealing  a  poorly  structured  component
of  the  system.

[Kafu88]  Abstract:  This  paper  reports  the  results  of  a  study  which  examined  the  relationship  between  a  collec¬ 
tion  of  software  metrics  and  the  development  data  (such  as  errors  and  coding  time)  of  three  commercially  pro¬ 
duced  software  systems.  The  software  metrics  include  both  measures  of  system  interconnectivity  and  measures 
of  system  code.  This  study  revealed  strong  relationships  between  the  metrics  and  the  development  data  when 
individual  components  were  aggregated  by  structure  (into  subsystems)  or  by  similarity  (into  groups).  The  subsys¬ 
tem  and  group  results  imply  that  research  and  application  of  metrics  can  guide  the  effective  application  of  project 
resources  by  identifying  those  groups  which,  for  example,  will  contain  a  disproportionately  large  fraction  of 
errors.  Finally,  the  study  showed  the  overall  utility  of  two  interconnectivity  metrics:  Henry  and  Kafura’s 
information  flow  metric  and  McClure’s  invocation  metric.  This  result  is  significant  because  interconnectivity
metrics  can  be  applied  early  in  the  life  cycle. 
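
For orientation, Henry and Kafura's information flow metric is commonly cited as length * (fan-in * fan-out)^2 for each procedure, where fan-in and fan-out count the information flows into and out of the procedure. The abstract does not state the formula, so the sketch below evaluates that commonly cited form under that assumption; the function name and the sample figures are hypothetical.

```python
def information_flow_complexity(length, fan_in, fan_out):
    """Commonly cited form of the Henry-Kafura information flow metric.

    length  : size of the procedure (for example, lines of code)
    fan_in  : number of information flows into the procedure
    fan_out : number of information flows out of the procedure
    """
    return length * (fan_in * fan_out) ** 2


def rank_components(components):
    """Sort (name, length, fan_in, fan_out) tuples, most complex first."""
    return sorted(components,
                  key=lambda c: information_flow_complexity(c[1], c[2], c[3]),
                  reverse=True)
```

Because fan-in and fan-out can be read from a design before any code exists, with length set to 1 or to an estimate, a ranking of this kind is available early in the life cycle, which is the point the abstract makes about interconnectivity metrics.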

[Kahn77]  Abstract:  The  concept  of  coroutine  or  process  is  useful  in  a  large  class  of  applications,  usually  involving
incremental  generation  or  transformation  of  data.  We  present  a  language  based  on  a  clear  semantics  of  process
interaction,  which  facilitates  well-structured  programming  of  dynamically  evolving  networks  of  processes.
These  networks  exhibit  the  same  input/output  behavior  whether  they  are  executed  sequentially  or  in  parallel. 
Sample  program  proofs  are  used  to  illustrate  the  benefits  of  the  language’s  simple  denotational  semantics.  The 
language  serves  also  to  clarify  the  relationships  between  coroutines,  call-by-need,  dynamic  data  structures  and 
parallel  computation. 
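
A network of processes that incrementally generate and transform data can be approximated with Python generators, which behave like coroutines connected by streams. This is a sketch in Python, not the language presented in [Kahn77], and it runs the network sequentially on demand only; the process names are assumptions. Each stage reads only from its input stream, so its output is a function of its input history alone, which is the property that makes the result independent of scheduling.

```python
from itertools import islice


def integers(start=1):
    """Producer process: an unbounded stream of consecutive integers."""
    n = start
    while True:
        yield n
        n += 1


def scale(stream, factor):
    """Transformer process: multiply every item of the input stream."""
    for value in stream:
        yield value * factor


def running_sum(stream):
    """Transformer process: emit the partial sums of the input stream."""
    total = 0
    for value in stream:
        total += value
        yield total


def first(stream, count):
    """Consumer: demand the first count items from a network."""
    return list(islice(stream, count))
```

Connecting the stages as running_sum(scale(integers(), 2)) and asking for five items yields the partial sums of 2, 4, 6, 8, 10. Only as many integers as are needed get generated, in the call-by-need style the abstract mentions.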

[Kant80]  Abstract:  In  this  paper  we  consider  the  application  of  the  recovery  block  concept  to  parallel  programs 
for  ensuring  increased  reliability  despite  the  presence  of  software  bugs.  The  basic  idea  of  this  technique  is  to 
include  standby  software  components  in  the  program  which  can  be  “switched  on”  in  case  the  active  component 
fails.  However,  before  this  could  be  done,  the  system  must  be  rolled  back  to  a  consistent  state.  One  of  the  goals 
in  this  rollback  is  to  avoid  undoing  a  large  amount  of  computation.  It  is  shown  that  the  process  interaction  must 
be  severely  constrained  in  order  to  achieve  this  goal.  Sufficient  conditions  for  limiting  the  rollback  in  a  system  of 
processes  communicating  via  monitors  are  also  presented.
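
The basic recovery block idea, a primary component, standby alternates, and a rollback to a consistent state before each retry, can be sketched for a single sequential process as follows. The sketch deliberately ignores the subject of [Kant80], namely parallel programs and the constraints on process interaction needed to limit rollback. The function names and the use of a deep copy as the checkpoint are assumptions.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Try each alternate in turn on a fresh copy of the checkpointed state.

    Returns the first resulting state that passes the acceptance test.
    The caller's state object is never modified.
    """
    checkpoint = copy.deepcopy(state)           # consistent state to roll back to
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)   # rollback: discard earlier effects
        try:
            alternate(candidate)
        except Exception:
            continue                            # a crash counts as a failed alternate
        if acceptance_test(candidate):
            return candidate
    raise RuntimeError("all alternates failed the acceptance test")
```

A faulty primary that leaves a list unsorted is rejected by an acceptance test that checks ordering, and the standby alternate then runs from the original checkpoint rather than from the damaged state.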

[Kapp88]  Abstract:  IDA  Paper  P-2028  documents  a  tool  that  can  facilitate  the  description  of  processes  for  the 
Strategic  Defense  System  (SDS)  and  Battle  Management/Command,  Control  and  Communications  (BM/C3) 
architectures.  The  process  descriptions  generated  by  this  tool  conform  to  the  Strategic  Defense  Initiative  Organ¬ 
ization  (SDI)  Architecture  Dataflow  Modeling  Technique  (SADMT). 

[Katz87]  Abstract:  Packages,  subprograms,  generics,  and  tasks  are  the  building  blocks  of  Ada  systems.  They 
can  combine  to  hide  information,  group  information,  isolate  dependencies,  and  create  reusable  pieces.  How¬ 
ever,  different  people  view  them  from  different  perspectives  for  different  purposes.  Therefore,  people  have  dif¬ 
ferent  expectations  when  they  discuss  modules  and  modularity.  In  this  paper,  we  describe  four  definitions  that 
divide  systems  into  program  blocks,  or  modules,  using  various  structural  criteria.  We  use  this  common  terminol¬ 
ogy  to  show  how  the  different  views  on  the  system  can  help  us  better  understand  the  modularity  of  the  system. 
Finally,  we  use  programs  from  studies  comparing  design  techniques  and  measuring  maintainability  to  show  how 
these  ideas  may  be  applied. 

[Kear86]  Abbreviated  Introduction:  Inappropriate  use  of  software  complexity  measures  can  have  large,  damag¬ 
ing  effects  by  rewarding  poor  programming  practices  and  demoralizing  good  programmers.  Software  complexity 
measures  must  be  critically  evaluated  to  determine  the  ways  in  which  they  can  best  be  used. 

[Keil87]  Abstract:  We  suggest  that  users  are  interested  solely  in  the  quality  of  prediction  which  can  be  obtained 
from  software  reliability  models.  Some  ways  of  analyzing  the  quality  of  predictions  are  proposed  and  several 
models  and  inference  procedures  are  compared  against  software  failure  data  sets.  We  conclude  that  some  predic¬ 
tions  are  extremely  poor,  notably  those  arising  from  ML  analysis  of  the  Jelinski-Moranda  model.  Others  are 
quite  good.  We  suggest  promising  areas  for  future  work. 

[Kell76]  Abstract:  Two  formal  models  for  parallel  computation  are  presented:  an  abstract  conceptual  model  and 
a  parallel-program  model.  The  former  model  does  not  distinguish  between  control  and  data  states.  The  latter 
model  includes  the  capability  for  the  representation  of  an  infinite  set  of  control  states  by  allowing  there  to  be 
arbitrarily  many  instruction  pointers  (or  processes)  executing  the  program.  An  induction  principle  is  presented 
which  treats  the  control  and  data  state  sets  on  the  same  ground.  Through  the  use  of  “place  variables,”  it  is 
observed  that  certain  correctness  conditions  can  be  expressed  without  enumeration  of  the  set  of  all  possible  con¬ 
trol  states.  Examples  are  presented  in  which  the  induction  principle  is  used  to  demonstrate  proofs  of  mutual 
exclusion.  It  is  shown  that  assertions-oriented  proof  methods  are  special  cases  of  the  induction  principle.  A 
special  case  of  the  assertions  method,  which  is  called  parallel  place  assertions,  is  shown  to  be  incomplete.  A  formalization
of  “deadlock”  is  then  presented.  The  concept  of  a  “norm”  is  introduced,  which  yields  an  extension,
to  the  deadlock  problem,  of  Floyd’s  technique  for  proving  termination.  Also  discussed  is  an  extension  of  the 
program  model  which  allows  each  process  to  have  its  own  local  variables  and  permits  shared  global  variables. 
Correctness  of  certain  forms  of  implementation  is  also  discussed.  An  appendix  is  included  which  relates  this 
work  to  previous  work  on  the  satisfiability  of  certain  logical  formulas. 

[Kell83]  Abstract:  In  this  paper  the  design  diversity  approach  of  fault-tolerant  multi-version  software  as  a  complement
to  fault  avoidance  is  discussed  and  an  experiment  is  described  in  which  32  programmers  were  employed.

It  was  found  that  formal  specification  languages,  while  showing  promise  for  the  future,  are  presently  very  difficult 
to  use  and  understand,  and  are  severely  limited  in  power.  Software  errors  encountered  in  the  experiment  were 
studied  and  classified.  The  increase  in  reliability  seen  in  multi-version  software  over  individual-version  software 
was  substantial;  it  was  even  possible  to  combine  three  faulty  versions  and  produce  a  combination  that  was  completely
fault-tolerant.

[Kell85a]  Abstract:  ADAMAT,  an  Ada  Measurement  and  Analysis  Tool,  provides  immediate  assistance  for  1)
improving  the  quality  of  Ada  software,  and  2)  training  Ada  programmers.  The  underlying  metrics  framework  is
hierarchical,  based  on  the  McCall  metrics  framework,  tailored  to  the  Ada  language,  and  formally  defined  using
Prolog.  The  automated  data  collection  component  is  automatically  generated  using  compiler  generation  techniques,
which  include  a  descriptive  technique  for  describing  pattern  matching  in  a  well-defined  language.  The
quality  analysis  component,  based  on  the  formal  definition  of  the  metrics,  provides  users  with  interactive
analysis  of  the  metric  data,  and  allows  users  to  step  through  the  Ada  metrics  hierarchy  to  pinpoint  problem 
areas. 


[Kell85b]  Abbreviated  Introduction:  This  paper  discusses  an  approach  to  creating  an  automated  metrics  tool  for
measuring  the  level  of  adherence  to  software  quality  guidelines  in  Ada  source  code.  The  tool  provides  managers 
with  the  visibility  needed  to  control  the  application  of  quality  guidelines  on  software  programs. 

Software  quality  management  is  very  difficult,  because  the  quality  of  software  is  not  transparent;  its 
assessment  requires  a  thorough  review  of  specifications  and  code.  By  formalizing  software  quality  principles,  we 
can  develop  a  tool  that  helps  monitor  their  use. 

[Kemm80]  Abstract:  This  paper  gives  an  overview  of  the  Formal  Development  Methodology  (FDM)  and  the  Ina 
Jo  formal  specification  language.  FDM  is  an  integrated  methodology  for  the  design,  specification,  implementation,
and  verification  of  software.  It  enforces  rigorous  connections  between  successive  stages  of  development.
The  components  of  the  FDM  are  the  Ina  Jo  formal  specification  language,  the  specification  processor,  the 
interactive  theorem  prover,  and  a  verification  condition  generator. 

This  paper  gives  an  overview  of  each  of  the  components  and  discusses  how  each  fits  into  the  overall  verification
process.  Examples  of  the  different  constructs  in  the  specification  language  are  presented  as  well  as  a  sample
two-level  formal  specification.

[Kemm85a]  Abstract:  Formal  specification  and  verification  techniques  are  now  used  to  increase  the  reliability  of 
software  systems.  However,  these  approaches  sometimes  result  in  specifying  systems  that  cannot  be  realized  or 
that  are  not  usable.  This  paper  demonstrates  why  it  is  necessary  to  test  specifications  early  in  the  software  life 
cycle  to  guarantee  a  system  that  meets  its  critical  requirements  and  that  also  provides  the  desired  functionality. 
Definitions  to  provide  the  framework  for  classifying  the  validity  of  a  functional  requirement  with  respect  to  a  formal
specification  are  also  introduced.  Finally,  the  design  of  two  tools  for  testing  formal  specifications  is  discussed.

[Kemm86]  Abstract:  This  is  the  first  volume  of  the  final  report  of  a  verification  assessment  study  that  was  begun
in  November  1984  and  lasted  for  approximately  nine  months.  The  final  report  consists  of  five  volumes.  This 
volume  contains  an  overview  of  the  study,  some  conclusions  that  were  formulated,  and  directions  for  future 
research  efforts  in  formal  verification.

The  main  goal  of  this  effort  was  a  technology  interchange  among  the  developers  of  four  established  verification
systems.  The  systems  investigated  were  i)  Affirm  (General  Electric  Company,  Schenectady,  New  York),
ii)  FDM  (System  Development  Corporation  -  A  Burroughs  Company,  Santa  Monica,  California),  iii)  Gypsy  (the
University  of  Texas  at  Austin,  Austin,  Texas),  and  iv)  Enhanced  HDM  (SRI  International,  Menlo  Park,  California).

There  was  some  comparative  work  on  examples,  but  the  main  idea  was  for  the  developers  to  learn  the 
details  of  each  other’s  system  as  a  basis  for  future  development.  It  would  have  been  interesting  and  informative 
to  look  at  other  systems,  but  time  did  not  allow  for  this. 

It  was  not  the  goal  of  this  study  to  rate  the  verification  systems  that  were  investigated.  It  was  also  not  the 
intent  of  the  study  to  justify  the  need  for  formal  specification  and  verification  systems  or  to  justify  the  necessity 
for  research  in  this  area. 

[Kern74a]  Abbreviated  Introduction:  Good  programming  cannot  be  taught  by  preaching  generalities.  The  way
to  learn  to  program  well  is  by  seeing,  over  and  over,  how  real  programs  can  be  improved  by  the  application  of  a 
few  principles  of  good  practice  and  a  little  common  sense.  Practice  in  critical  reading  leads  to  skill  in  rewriting, 
which  in  turn  leads  to  better  writing. 

This  book  is  a  study  of  a  large  number  of  “real”  programs,  each  of  which  provides  one  or  more  lessons  in 
style.  We  discuss  the  shortcomings  of  each  example,  rewrite  it  in  a  better  way,  then  draw  a  general  rule  from  the 
specific  case.  The  approach  is  pragmatic  and  down-to-earth;  we  are  more  interested  in  improving  current  programming
practice  than  in  setting  up  an  elaborate  theory  of  how  programming  should  be  done.

The  examples  we  give  are  all  in  Fortran  and  PL/I,  since  these  languages  are  widely  used  and  are  sufficiently
similar  that  a  reading  knowledge  of  one  means  that  the  other  can  also  be  read  well  enough.  (We  avoid
complicated  constructions  in  either  language  and  explain  unavoidable  idioms  as  we  encounter  them.)  The  principles
of  style,  however,  are  applicable  in  all  languages,  including  assembly  codes.

[Kern74b]  Abstract:  Computer  programs  can  be  written  many  different  ways  and  still  achieve  the  same  effect. 
Until  recently,  programmers  have  had  little  reason  to  favor  one  method  of  expressing  code  over  another.  We  have 
come  to  learn,  however,  that  functionally  equivalent  programs  can  have  extremely  important  stylistic  differences. 

Good  programming  style  cuts  across  application  areas,  technique  and  language.  Programs  written  with 
good  style  are  easier  to  read  and  understand,  and  often  smaller  and  more  efficient,  than  those  written  badly.  Yet 
few  programmers  have  ever  been  taught  what  style  is,  as  we  can  see  from  even  cursory  inspection  of  their  code. 
Even  the  techniques  of  structured  programming  do  not  ensure  that  code  will  be  good;  “structured”  programs  can
be  just  as  bad  as  their  unstructured  counterparts. 

This  paper  is  a  survey  of  some  aspects  of  programming  style,  primarily  expression  and  structure,  showing 
by  examples  what  happens  when  principles  of  style  are  violated,  and  what  can  be  done  to  improve  programs.  To 
add  the  ring  of  truth  to  our  discussion,  the  examples  are  all  taken  verbatim  from  programming  textbooks. 

[Kern81]  Table  of  Contents:  Filters.  Files.  Sorting.  Text  Patterns.  Editing.  Formatting.  Macro  Processing.

[Kieb83]  Abstract:  Abstract  data  types,  and  in  particular  those  that  are  designed  to  provide  resources  for  use  by 
concurrently  executable  programs,  are  often  designed  to  be  used  only  in  certain  ways.  The  intended  constraints 
on  use  of  an  instance  of  such  a  type  can  be  expressed  in  two  principal  ways:  as  assertions  on  the  domain  of  the
values  input  to  each  operator,  and  as  constraints  on  the  sequences  in  which  the  operators  of  the  type  can  be 
called  by  a  customer  process.  These  constraints  must  be  enforced  in  the  environment  in  which  an  instance  of  the 
type  is  used.  Nevertheless,  they  are  very  much  a  part  of  the  type  specification,  for  its  definition  is  not  complete, 
nor  can  the  consistency  of  its  representation  be  proved,  without  them. 

A  notation  is  provided  in  which  to  express  sequential  constraints,  which  are  here  called  access-right 
expressions.  It  is  suggested  that  these  expressions  should  be  declared  in  a  programming  language  that  supports 
the  definition  of  monitors  or  resource  managers.  Implications  for  the  proof  rules  of  monitors  are  discussed,  and 
suggestions  are  made  for  a  programming  language  implementation. 

[King75a]  Abstract:  The  current  approach  for  testing  a  program  is,  in  principle,  quite  primitive.  Some  small
sample  of  the  data  that  a  program  is  expected  to  handle  is  presented  to  the  program.  If  the  program  produces 
correct  results  for  the  sample,  it  is  assumed  to  be  correct.  Much  current  work  focuses  on  the  question  of  how  to 
choose  this  sample.  We  propose  that  a  program  can  be  more  effectively  tested  by  executing  it  “symbolically.” 
Instead  of  supplying  specific  constants  as  input  values  to  a  program  being  tested,  one  supplies  symbols.  The  normal
computational  definitions  for  the  basic  operations  performed  by  a  program  can  be  expanded  to  accept  symbolic
inputs  and  produce  symbolic  formulae  as  output.

If  the  flow  of  control  in  the  program  is  completely  independent  of  its  input  parameters,  then  all  output 
values  can  be  symbolically  computed  as  formulae  over  the  symbolic  inputs  and  examined  for  correctness.  When 
the  control  flow  of  the  program  is  input  dependent,  a  case  analysis  can  be  performed  producing  output  formulae 
for  each  class  of  inputs  determined  by  the  control  flow  dependencies.  Using  these  ideas,  we  have  designed  and 
implemented  an  interactive  debugging/testing  system  called  EFFIGY. 

[King76]  Abstract:  This  paper  describes  the  symbolic  execution  of  programs.  Instead  of  supplying  the  normal 
inputs  to  a  program  (e.g.  numbers)  one  supplies  symbols  representing  arbitrary  values.  The  execution  proceeds 
as  in  a  normal  execution  except  that  values  may  be  symbolic  formulas  over  the  input  symbols.  The  difficult,  yet 
interesting  issues  arise  during  the  symbolic  execution  of  conditional  branch  type  statements.  A  particular  system 
called  EFFIGY  which  provides  symbolic  execution  for  program  testing  and  debugging  is  also  described.  It  interpretatively
executes  programs  written  in  a  simple  PL/I  style  programming  language.  It  includes  many  standard
debugging  features,  the  ability  to  manage  and  prove  things  about  symbolic  expressions,  a  simple  program  testing 
manager,  and  a  program  verifier.  A  brief  discussion  of  the  relationship  between  symbolic  execution  and  program 
proving  is  also  included. 
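
Illustrative sketch (not part of the cited abstract): the idea of supplying symbols instead of values, and of forking at each input-dependent branch, can be shown with a toy interpreter. The statement encoding, the function name sym_exec, and the string representation of formulas are assumptions made for this example; EFFIGY itself is not modeled, and no attempt is made to prune infeasible paths.

```python
# Toy symbolic executor: values are strings standing for formulas over the inputs.
def sym_exec(stmts, env, path):
    """Return a list of (path_condition, final_env) pairs, one per explored path.

    Each statement is one of:
      ("assign", var, expr_fn)                   expr_fn maps env -> formula string
      ("if", cond_fn, then_stmts, else_stmts)    cond_fn maps env -> formula string
    """
    if not stmts:
        return [(path, env)]
    head, rest = stmts[0], stmts[1:]
    if head[0] == "assign":
        _, var, expr_fn = head
        new_env = dict(env)
        new_env[var] = expr_fn(env)
        return sym_exec(rest, new_env, path)
    if head[0] == "if":
        _, cond_fn, then_stmts, else_stmts = head
        cond = cond_fn(env)
        # Fork: follow both branches, recording the assumption made on each.
        taken = sym_exec(then_stmts + rest, env, path + [cond])
        not_taken = sym_exec(else_stmts + rest, env, path + [f"not ({cond})"])
        return taken + not_taken
    raise ValueError(f"unknown statement {head[0]!r}")


# Program under test: if x > y then z := x - y else z := y - x
program = [
    ("if", lambda e: f"{e['x']} > {e['y']}",
     [("assign", "z", lambda e: f"({e['x']} - {e['y']})")],
     [("assign", "z", lambda e: f"({e['y']} - {e['x']})")]),
]

# Inputs are the symbols X and Y rather than concrete numbers.
results = sym_exec(program, {"x": "X", "y": "Y"}, [])
for path, env in results:
    print(" and ".join(path), "=>", "z =", env["z"])
```

Each result pairs a path condition with the symbolic output computed along that path, which corresponds to the case analysis over input-dependent control flow that the abstracts describe.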

[Kitc81]  Abstract:  The  increasing  cost  of  software  development  and  maintenance  has  revealed  the  need  to  identify
methods  that  encourage  the  production  of  high  quality  software.  This  in  turn  has  highlighted  the  need  to  be
able  to  quantify  factors  influencing  the  amount  of  effort  needed  to  produce  such  software,  such  as  program  complexity.

Two  approaches  to  the  problem  of  identifying  complexity  metrics  have  attracted  interest  in  America;  the 
theoretical  treatment  of  software  science  by  Halstead  of  Purdue  University  and  the  graph-theoretical  concept 
developed  by  McCabe  of  the  US  Department  of  Defense.  This  paper  reports  an  attempt  to  assess  the  ability  of 
the  measures  of  complexity  proposed  by  these  authors  to  provide  objective  indicators  of  the  effort  involved  in 
software  production,  when  applied  to  selected  subsystems  of  the  ICL  operating  system  VME/B.  The  proposed 
metrics  were  computed  for  each  of  the  modules  comprising  these  subsystems,  also  counts  of  the  numbers  of 
machine-level  instructions  (Primitive  Level  Instructions,  ‘PLI’)  and  measures  of  the  effort  involved  in  bringing 
the  modules  to  an  acceptable  standard  for  field  release.  It  was  found  that  all  the  complexity  metrics  were  correlated
positively  with  the  measure  of  effort,  those  modules  which  had  proved  more  difficult  having  large  values  for
all  these  metrics.  However,  neither  Halstead’s  nor  McCabe’s  metrics  offered  any  substantial  improvement  over 
the  simple  ‘PLI’  count  as  predictors  of  effort. 
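
Illustrative sketch (not part of the cited abstract): McCabe's graph-theoretic measure is commonly stated as V(G) = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components of the control-flow graph. The edge-list encoding and the example graph below are assumptions made for this example, not data from the study.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's cyclomatic number V(G) = E - N + 2P for a control-flow graph.

    edges: iterable of (source, target) node pairs.
    components: number of connected components P (1 for a single routine).
    """
    edges = list(edges)
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Hypothetical control-flow graph: one if/else followed by one while loop.
cfg = [
    ("entry", "then"), ("entry", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]

# 7 edges, 6 nodes, 1 component: two decision points plus one.
print(cyclomatic_complexity(cfg))
```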

[Knig85a]  Abstract:  This  paper  describes  an  experiment  in  which  simple  syntactic  alterations  were  introduced 
into  program  text  in  order  to  evaluate  the  testing  strategy  known  as  error  seeding.  The  experiment’s  goal  was  to 
determine  if  randomly  placed  syntactic  manipulations  can  produce  failure  characteristics  similar  to  those  of  indigenous
errors  found  within  unseeded  programs.  As  a  result  of  a  separate  experiment,  several  programs  were
available,  all  of  which  were  written  to  the  same  specifications  and  thus  were  intended  to  be  functionally
equivalent.  The  use  of  functionally  equivalent  programs  allowed  the  influence  of  individual  programmer  styles
to  be  removed  as  a  variable  from  the  error  seeding  experiment.  Each  of  six  different  syntactic  manipulations
was  introduced  into  each  program  and  the  mean  times  to  failure  for  the  seeded  errors  were  observed.  The
seeded  errors  were  found  to  have  a  broad  spectrum  of  mean  times  to  failure  independent  of  the  syntactic
alteration  used.  We  conclude  that  it  is  possible  to  seed  errors  using  only  simple  syntactic  techniques  that
are  arbitrarily  difficult  to  locate.  In  addition,  several  unexpected  results  indicate  that  some
issues  involved  in  error  seeding  have  not  been  addressed  previously.

[Knig85b]  Abstract:  Symbolic  execution  is  the  execution  of  a  computer  program  with  symbolic  rather  than
actual  values.  It  has  been  proposed  as  a  method  of  proving  that  a  program  is  correct  but  has  only  been  applied
previously  to  sequential  programs.  The  introduction  of  concurrency  into  programming  languages  provides  many
new  opportunities  for  programming  errors,  and,  because  of  the  nondeterminism,  errors  in  concurrent  programs
are  often  harder  to  find  than  errors  in  sequential  programs.  In  this  paper  we  discuss  a  system  that  symbolically
executes  concurrent  programs.  Rather  than  deal  with  concurrency  in  general,  the  system  described  here  deals
with  the  concurrent  aspects  of  a  specific  programming  language,  namely  Ada.  We  chose  deliberately  to  investigate
all  the  detail  of  an  actual  programming  language,  and  we  chose  Ada  because  of  its  modern  design  and
expected  widespread  use.  Our  goal  was  to  attempt  the  symbolic  execution  of  the  concurrent  features  of  Ada  to 
see  whether  useful  diagnostic  information  about  erroneous  Ada  programs  could  be  generated.  The  system 
described  is  partially  implemented  and  correctly  identifies  errors  that  are  not  caught  by  compilers. 

[Knig86a]  Abstract:  N-version  programming  has  been  proposed  as  a  method  of  incorporating  fault-tolerance 
into  software.  Multiple  versions  of  a  program  (i.e.,  “N”)  are  prepared  and  executed  in  parallel.  Their  outputs  are 
collected  and  examined  by  a  voter,  and,  if  they  are  not  identical,  it  assumes  that  the  majority  is  correct.  This 
method  depends  for  its  reliability  improvement  on  the  assumption  that  programs  that  have  been  developed 
independently  will  fail  independently.  In  this  paper,  an  experiment  is  described  in  which  the  fundamental  axiom 
is  tested.  A  total  of  27  versions  of  a  program  were  prepared  independently  from  the  same  specification  at  two 
universities  and  then  subjected  to  one  million  tests.  The  results  of  the  tests  revealed  that  the  programs  were  individually
extremely  reliable  but  that  the  number  of  tests  in  which  more  than  one  program  failed  was  substantially
more  than  expected.  The  results  of  these  tests  are  presented  along  with  an  analysis  of  some  of  the  faults  that  were
found  in  the  programs.  Background  information  on  the  programmers  used  is  also  summarized.  The  conclusion 
from  this  experience  is  that  N-version  programming  must  be  used  with  care  and  that  analysis  of  its  reliability 
must  include  the  effect  of  dependent  errors. 
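
Illustrative sketch (not part of the cited abstract): the voting scheme described above, and the reason dependent failures matter, can be shown with a small majority voter. The function names and the deliberately faulty version are assumptions made for this example and do not come from the experiment.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if 2 * count > len(outputs) else None


def run_n_versions(versions, x):
    """Run every version on the same input and vote on the results."""
    return majority_vote([version(x) for version in versions])


def correct_abs(x):
    return x if x >= 0 else -x


def faulty_abs(x):
    # Hypothetical fault: negative inputs are returned unchanged.
    return x


# One faulty version among three: the fault is outvoted.
independent = [correct_abs, correct_abs, faulty_abs]
# Two versions sharing the same fault: the wrong answer wins the vote.
correlated = [correct_abs, faulty_abs, faulty_abs]

print(run_n_versions(independent, -3))
print(run_n_versions(correlated, -3))
```

The second call shows why the reliability analysis must account for coincident failures: the voter cannot distinguish a correct majority from a majority that shares a fault.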

[Knut71]  Summary:  A  sample  of  programs,  written  in  FORTRAN  by  a  wide  variety  of  people  for  a  wide  variety 
of  applications,  was  chosen  ‘at  random’  in  an  attempt  to  discover  quantitatively  ‘what  programmers  really  do.’ 
Statistical  results  of  this  survey  are  presented  here,  together  with  some  of  their  apparent  implications  for  future 
work  in  compiler  design.  The  principal  conclusion  which  may  be  drawn  is  the  importance  of  a  program  ‘profile,’ 
namely  a  table  of  frequency  counts  which  record  how  often  each  statement  is  performed  in  a  typical  run;  there 
are  strong  indications  that  profile-keeping  should  become  a  standard  practice  in  all  computer  systems,  for  casual 
users  as  well  as  system  programmers.  This  paper  is  the  report  of  a  three  month  study  undertaken  by  the  author 
and  about  a  dozen  students  and  representatives  of  the  software  industry  during  the  summer  of  1970.  It  is  hoped 
that  a  reader  who  studies  this  report  will  obtain  a  fairly  clear  conception  of  how  FORTRAN  is  being  used,  and 
what  compilers  can  do  about  it. 
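
Illustrative sketch (not part of the cited summary): a program "profile" in the sense used above is a table of how often each statement executes. The example below collects such counts for one Python function using the interpreter's tracing hook; the helper name profile and the sample function are assumptions made for this example, and the study itself concerned FORTRAN programs.

```python
import sys
from collections import Counter


def profile(func, *args):
    """Run func(*args); return (result, Counter of line offset -> execution count).

    Offsets are relative to the 'def' line of func (offset 0).
    """
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace frames that are executing the function being profiled.
        if frame.f_code is code:
            if event == "line":
                counts[frame.f_lineno - code.co_firstlineno] += 1
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, counts


def sum_of_squares(n):
    total = 0               # offset 1
    for i in range(n):      # offset 2
        total += i * i      # offset 3
    return total            # offset 4


result, counts = profile(sum_of_squares, 5)
for offset in sorted(counts):
    print(f"line +{offset}: executed {counts[offset]} time(s)")
```

The loop body dominates the counts, which is the kind of information the summary argues should be routinely available to users and compiler writers.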

[Knut73]  Abstract:  A  procedure  recently  devised  by  A.  Nahapetian,  for  reducing  the  number  of  measurements 
needed  to  determine  all  the  execution  frequencies  in  a  computer  program,  is  shown  to  be  optimal,  by  interpreting
the  procedure  in  a  new  way.

[Kopp76]  Abbreviated  Introduction  and  Summary:  The  Process  Design  Engineering  Program,  under  the  direction
of  the  Ballistic  Missile  Defense  Advanced  Technology  Center,  has  as  its  objective  the  development  of  a  unified
software  engineering  discipline  addressing  all  software  development  problems  from  receipt  of  software
requirements  to  delivery  of  the  operational  software  system.  During  the  first  two  years  of  this  program,  initial 
process  design  engineering  and  management  procedures  were  developed  which  led  to  the  systematic  top-down 
development  of  real-time  software  processes.  A  prototype  set  of  software  tools  to  support  these  procedures  was 
designed  and  implemented  as  Process  Design  System  1  (PDS  1),  and  an  experimental  BMD  baseline  software  process
was  then  designed  and  implemented  using  these  techniques  and  tools.

This  paper  concentrates  on  some  of  the  features  of  the  Process  Design  System  2  and  the  manner  in  which 
its  components  interact.  Special  emphasis  is  placed  on  the  error  detection  capability  of  the  system  and  the 
characteristics  of  the  Process  Design  Language  (PDL).  The  current  status  of  PDS  and  planned  future  efforts  are 
discussed.

[Kore88]  Abstract:  A  recently  developed,  experimental,  integrated  System  for  Testing  And  Debugging  is 
presented.  Its  testing  part  supports  three  data  flow  coverage  criteria.  The  debugging  part  guides  the  programmer 
in  the  localization  of  faults  by  generating  and  interactively  verifying  hypotheses  about  their  location. 

[Krau88]  Abstract:  This  paper  describes  how  software  testing  using  mutation  analysis  can  be  performed  very 
efficiently  on  an  SIMD  machine.  Mutation  analysis  provides  effective  means  of  determining  the  reliability  of 
large  software  systems.  However,  the  cost  of  conducting  such  a  software  test  can  be  computationally  expensive. 
Current  implementations  [of]  mutation  tools  are  unacceptably  slow  and  are  only  suitable  for  testing  relatively 
small  programs. 

Our  research  has  shown  that  most  of  the  general  purpose  machine  architectures  available  commercially 
can  be  utilized  efficiently  to  carry  out  cost  effective  mutation  analysis  software  testing.  We  have  shown  this  to  be 
the  case  for  vector  multiprocessors.  In  this  paper,  we  develop  a  technique  that  permits  unified  scheduling  of  multiple
mutant  programs  on  a  very  large  SIMD  machine.  We  believe  that,  for  the  first  time  in  the  field  of  software
testing,  supercomputers  with  novel  architectures  can  be  used  to  enhance  software  productivity  by  employing 
techniques  like  the  one  proposed  in  this  paper. 
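
Illustrative sketch (not part of the cited abstract): mutation analysis runs a test suite against many slightly altered copies of a program ("mutants") and reports the fraction that the tests detect. The sequential example below uses relational-operator replacement on a tiny function; the names make_program, mutant_ops, and killed are assumptions made for this example, and the paper's SIMD scheduling technique is not modeled. Each mutant run is independent of the others, which is what makes parallel execution attractive.

```python
import operator


def make_program(op):
    """Build the program under test, parameterized by the operator a mutant replaces."""
    def maximum(a, b):
        return a if op(a, b) else b
    return maximum


original = make_program(operator.gt)

# Mutants: the '>' comparison replaced by each alternative relational operator.
mutant_ops = {
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "ne": operator.ne,
}

# Test suite as ((arguments), expected result) pairs.
tests = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]


def killed(program):
    """A mutant is killed if at least one test observes a wrong result."""
    return any(program(*args) != expected for args, expected in tests)


results = {name: killed(make_program(op)) for name, op in mutant_ops.items()}
score = sum(results.values()) / len(results)

for name, was_killed in results.items():
    print(name, "killed" if was_killed else "survived")
print("mutation score:", score)
```

The surviving ">=" mutant behaves identically to the original for every input (an equivalent mutant), so no test can kill it; handling such cases is one reason mutation scores need interpretation.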

[Krie80]  Abstract:  ANNA  is  a  proposal  to  extend  Ada  to  include  facilities  for  formally  specifying  the  intended 
behavior  of  Ada  programs  (or  portions  thereof)  at  all  stages  of  program  development.  ANNA  programs  are  Ada 
programs  with  formal  comments.  Formal  comments  in  ANNA  consist  of  virtual  Ada  text  and  annotations.  The 
syntax  and  semantics  of  different  kinds  of  annotations  are  defined:  declarative  annotations  (for  variables,  subtypes,
subprograms,  and  packages),  statement  annotations,  exception  annotations,  and  visibility  annotations.
ANNA  includes  a  small  number  of  predefined  attributes  which  may  appear  only  in  annotations,  e.g.,  access  type 
collections. 

The  lexical  structure  of  ANNA  is  designed  so  that  the  extensions  of  Ada  appear  as  Ada  comments. 
ANNA  programs  are  therefore  acceptable  by  Ada  translators.  The  semantics  of  annotations  are  defined  in  terms 
of  Ada  concepts,  in  particular  many  annotations  are  generalizations  of  the  constraint  concept.  It  is  therefore  a 
simple  step  for  the  Ada  programmer  to  use  ANNA  to  give  formal  specifications  of  programs. 

ANNA  is  intended  to  provide  a  formal  framework  within  which  different  theories  of  formal  specifications 
may  be  applied  to  Ada.  Our  proposal  omits  tasking  for  the  time  being. 

[Krie83]  Abstract:  One  of  the  major  concerns  in  the  design  of  the  Ada  programming  language  was  software  reliability.
Rigid  rules  are  stated  in  the  language  definition  that  allow  checking  of  program  properties  either  statically
(i.e.,  during  the  compilation)  or  dynamically  (i.e.,  during  the  execution).  In  fact,  Ada  compilers  are  required  to 
perform  those  checks  and  give  error  messages  during  compilation  for  static  errors,  and  raise  predefined  exceptions
during  execution  for  dynamic  errors.  If  a  dynamic  error  can  be  anticipated  during  compilation,  a  warning
may  be  given,  but  the  respective  exception  must  still  be  raised  in  case  the  program  is  submitted  for  execution. 

[Krie86]  Summary:  The  PROSPECTRA  project  aims  to  provide  a  rigorous  methodology  for  developing  correct 
software  and  a  comprehensive  support  system.  It  is  sponsored  by  the  Commission  of  the  European  Communities
under  the  ESPRIT  Programme,  ref.  390.

The  methodology  integrates  program  construction  and  verification  during  the  development  process.  User 
and  implementor  start  with  a  formal  specification,  the  interface  or  “contract”.  This  initial  specification  is  then 
gradually  transformed  into  an  optimized  machine-oriented  executable  program.  The  final  version  is  obtained  by 
stepwise  application  of  transformation  rules.  These  are  carried  out  by  the  system,  with  interactive  guidance  by 
the  implementor,  or  automatically  by  compact  transformation  tools. 

The  final  version  is  correct  by  construction;  only  the  applicability  of  transformation  rules  needs  to  be  verified
at  each  step,  assisted  by  the  system.  Transformation  rules  are  proved  correct,  analogously  to  theorems.
They  form  the  nucleus  of  an  extendible  knowledge  base,  the  method  bank,  together  with  pre-fabricated  program 
components,  previous  program  versions,  and  entire  development  histories  that  can  be  replayed. 

The  strict  methodology  of  Program  Development  by  Transformation  (based  on  the  CIP  approach)  is  completely
supported  by  the  system,  enabling  the  construction  of  “a  priori”  correct  programs  from  formal  specifications.
However,  the  system  also  allows  other  program  development  styles  where  the  user  assumes  responsibility
for  unguarded  development  transitions.  Moreover,  it  will  be  possible  to  integrate  existing  program  components 
based  on  their  specification,  and  to  develop  them  further. 

The  system  comprises  basic  components  for  the  application  of  individual  transformation  rules  and  of  compact
development  methods  described  as  transformation  scripts;  these  provide  its  real  power.  Any  kind  of  system
activity  is  conceptually  and  technically  regarded  as  a  transformation  of  a  “program”  at  one  of  the  system  layers. 
This  provides  for  a  uniform  user  interface,  reduces  system  complexity,  and  allows  the  construction  of  system 
components  in  a  highly  generative  way. 

[Krug88]  Abbreviated  Introduction:  Significant  improvement  in  software  reliability  calls  for  innovative  methods 
for  developing  software,  determining  its  readiness  for  release,  and  predicting  field  performance.  This  paper 
focuses  on  three  supporting  strategies  for  improving  software  quality.  First,  there  is  a  need  for  a  metric  or  a  set  of 
metrics  to  help  make  the  decision  of  when  to  release  the  product  for  customer  shipments.  Second,  accurately 
estimating  the  duration  of  system  testing,  while  not  directly  contributing  to  reliability,  makes  for  a  smoother 
introduction  of  the  product  to  the  marketplace. 

Finally,  achieving  significant  improvement  is  easier  given  the  ability  to  predict  field  failure  rates,  or 
perhaps  more  realistically,  to  compare  successive  software  products  upon  release.  Although  this  third  strategy 
will  be  discussed  in  this  paper,  the  emphasis  will  be  on  choosing  the  right  software  reliability  metric  and  confidently
managing  the  testing  effort  with  the  aid  of  software  reliability  growth  models.

[Lam84]  Abstract:  The  method  of  projections  is  a  new  approach  to  reduce  the  complexity  of  analyzing  nontrivial
communication  protocol  entities  and  communication  channels.  Protocol  entities  interact  by  exchanging
messages  through  channels;  messages  in  transit  may  be  lost,  duplicated  as  well  as  reordered.  Our  method  is 
intended  for  protocols  with  several  distinguishable  functions.  We  show  how  to  construct  image  protocols  for 
each  function.  An  image  protocol  is  specified  just  like  a  real  protocol.  An  image  protocol  system  is  said  to  be 
faithful  if  it  preserves  all  safety  and  liveness  properties  of  the  original  protocol  system  concerning  the  projected 
function.  An  image  protocol  is  smaller  than  the  original  protocol  and  can  typically  be  more  easily  analyzed.  Two 
protocol  examples  are  employed  herein  to  illustrate  our  method.  An  application  of  this  method  to  verify  a  version
of  the  high-level  data  link  control  (HDLC)  protocol  is  described  in  a  companion  paper.

[Lamb78]  Abstract:  Increasing  the  communication  between  customer  and  contractor  is  seen  as  an  effective  way 
of  improving  software  quality.  Experience  with  a  variety  of  software  methods  has  clearly  focused  on  both  the 
need  for  communication  and  the  means  of  accomplishing  it.  A  methodology  for  defining  software  requirements 
and  design  has  been  developed.  It  is  based,  in  part,  on  a  synergism  of  modeling  techniques.  Experiences  with  the 
methodology  have  resulted  in  refinements.  Elements  of  the  methodology,  experience  with  it,  and  current  applications
aimed  at  automating  its  use  are  described.

[Lamp77]  Abstract:  The  inductive  assertion  method  is  generalized  to  permit  formal,  machine-verifiable  proofs 
of  correctness  for  multiprocess  programs.  Individual  processes  are  represented  by  ordinary  flowcharts,  and  no 
special  synchronization  mechanisms  are  assumed,  so  the  method  can  be  applied  to  a  large  class  of  multiprocess 
programs.  A  correctness  proof  can  be  designed  together  with  the  program  by  a  hierarchical  process  of  stepwise
refinement,  making  the  method  practical  for  larger  programs.  The  resulting  proofs  tend  to  be  natural  formalizations
of  the  informal  proofs  that  are  now  used.

[Lamp78]  Abstract:  The  concept  of  one  event  happening  before  another  in  a  distributed  system  is  examined, 
and  is  shown  to  define  a  partial  ordering  of  the  events.  A  distributed  algorithm  is  given  for  synchronizing  a  sys¬ 
tem  of  logical  clocks  which  can  be  used  to  totally  order  the  events.  The  use  of  the  total  ordering  is  illustrated  with 
a  method  for  solving  synchronization  problems.  The  algorithm  is  then  specialized  for  synchronizing  physical 
clocks,  and  a  bound  is  derived  on  how  far  out  of  synchrony  the  clocks  can  become. 
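
The logical-clock rules summarized in [Lamp78] can be illustrated with a minimal sketch. It is an illustration only, not code from the cited paper; the names Process, local_event, send, and receive are hypothetical. Each process ticks its counter on every event, a message carries the sender's timestamp, and the receiver advances its clock past that timestamp. Ties between equal clock values are broken by process id, which gives a total order.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """One process with a logical clock (illustrative names, not from the paper)."""
    pid: int
    clock: int = 0

    def local_event(self):
        # Tick the clock for every local event.
        self.clock += 1
        return (self.clock, self.pid)

    def send(self):
        # Sending is an event; the message carries the new timestamp.
        self.clock += 1
        return self.clock

    def receive(self, msg_timestamp):
        # Advance past both the local clock and the message timestamp.
        self.clock = max(self.clock, msg_timestamp) + 1
        return (self.clock, self.pid)


p, q = Process(pid=1), Process(pid=2)
a = p.local_event()   # event on p
ts = p.send()         # p sends a message to q
b = q.receive(ts)     # q receives it; b is ordered after a
print(a, b, a < b)    # tuples compare by clock first, then by pid
```

Comparing the (clock, pid) tuples is what turns the partial "happened before" relation into a total order in this sketch.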

[Lamp79a] Abstract: A nonassertional approach to proving multiprocess correctness is described by proving
the  correctness  of  a  new  algorithm  to  solve  the  mutual  exclusion  problem.  The  algorithm  is  an  improved  version 
of  the  bakery  exclusion  algorithm.  It  is  specified  and  proved  correct  without  being  decomposed  into  indivisible 
atomic  operations.  This  allows  two  different  implementations  for  a  conventional,  nondistributed  system.  More¬ 
over,  the  approach  provides  a  sufficiently  general  specification  of  the  algorithm  to  allow  nontrivial  implementa¬ 
tions  for  a  distributed  system  as  well. 

[Lamp79b]  Abstract:  A  formal  specification  is  given  for  a  simple  calendar  program,  and  the  derivation  and 
proof  of  correctness  of  the  program  are  sketched.  The  specification  is  easy  to  understand,  and  its  correctness  is 
manifest  to  humans. 

[Lamp80] Abstract: Hoare’s logical system for specifying and proving partial correctness properties of sequential
programs is generalized to concurrent programs. The basic idea is to define the assertion {P} S {Q} to mean
that if execution is begun anywhere in S with P true, then P will remain true until S terminates, and Q will be true
if  and  when  S  terminates.  The  predicates  P  and  Q  may  depend  upon  program  control  locations  as  well  as  upon 
the  values  of  variables.  A  system  of  inference  rules  and  axiom  schemas  is  given,  and  a  formal  correctness  proof 
for  a  simple  program  is  outlined.  We  show  that  by  specifying  certain  requirements  for  the  unimplemented  parts, 
correctness  properties  can  be  proved  without  completely  implementing  the  program.  The  relation  to  Pnueli’s 
temporal  logic  formalism  is  also  discussed. 

[Lamp82]  Abstract:  Reliable  computer  systems  must  handle  malfunctioning  components  that  give  conflicting 
information  to  different  parts  of  the  system.  This  situation  can  be  expressed  abstractly  in  terms  of  a  group  of  gen¬ 
erals  of  the  Byzantine  army  camped  with  their  troops  around  an  enemy  city.  Communicating  only  by  messenger, 
the  generals  must  agree  upon  a  common  battle  plan.  However,  one  or  more  of  them  may  be  traitors  who  will  try 
to  confuse  the  others.  The  problem  is  to  find  an  algorithm  to  ensure  that  the  loyal  generals  will  reach  agreement. 
It is shown that, using only oral messages, this problem is solvable if and only if more than two-thirds of the generals
are loyal; so a single traitor can confound two loyal generals. With unforgeable written messages, the problem
is solvable for any number of generals and possible traitors. Applications of the solutions to reliable computer
systems are then discussed.
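
The oral-message bound stated in [Lamp82] can be checked with simple arithmetic. The sketch below is illustrative only, and the function name oral_messages_solvable is hypothetical. It tests whether more than two-thirds of the generals are loyal, which is equivalent to requiring more than three times as many generals as traitors.

```python
def oral_messages_solvable(generals: int, traitors: int) -> bool:
    """Return True when more than two-thirds of the generals are loyal."""
    if generals <= 0 or not 0 <= traitors <= generals:
        raise ValueError("need generals > 0 and 0 <= traitors <= generals")
    loyal = generals - traitors
    # loyal / generals > 2/3, written with integers to avoid rounding.
    return 3 * loyal > 2 * generals


# One traitor among three generals: exactly two-thirds are loyal, not more.
print(oral_messages_solvable(3, 1))
# One traitor among four generals: three-quarters are loyal.
print(oral_messages_solvable(4, 1))
```

The three-general case is the "single traitor can confound two loyal generals" situation mentioned in the abstract.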

[Lamp83]  Abstract:  A  method  for  specifying  program  modules  in  a  concurrent  program  is  described.  It  is  based 
upon  temporal  logic,  but  uses  new  kinds  of  temporal  assertions  to  make  the  specifications  simpler  and  easier  to 
understand. The semantics of the specifications is described informally, and a sequence of examples is given
culminating  in  a  specification  of  three  modules  comprising  the  alternating-bit  communication  protocol.  A  formal 
semantics  is  given  in  the  appendix. 

[Lamp84]  Abstract:  Generalized  Hoare  Logic  is  a  formal  logical  system  for  deriving  invariance  properties  of 
programs.  It  provides  a  uniform  way  to  describe  a  variety  of  methods  for  reasoning  about  concurrent  programs, 
including  noninterference,  satisfaction,  and  cooperation  proofs.  We  describe  a  simple  meta-rule  of  the  General¬ 
ized  Hoare  Logic  -  the  Decomposition  Principle  -  and  show  how  all  these  methods  can  be  derived  using  it. 

[Land77]  Abstract:  Modeling  of  systems  featuring  hardware  and  software  faults  is  studied  as  a  means  of  evaluat¬ 
ing  the  availability  and  reliability  characteristics.  The  case  of  a  non-redundant  computer  is  studied  and  it  is 
shown  that  the  unavailability  presents  an  overshoot  with  respect  to  its  asymptotic  value  whose  height  and  length 
are  functions  of  the  failure  rates  associated  with  the  different  design  errors.  Also,  a  fault-tolerant  system  is  stu¬ 
died  that  includes  protective  redundancies  at  the  hardware  and  software  levels. 

[Land86]  Abbreviated  Introduction:  The  Naval  Research  Laboratory  sponsored  this  workshop  to  invigorate 
research  in  both  program  verification  and  program  testing  through  cross-fertilization,  to  document  the  state  of 
the  art  and  practice  in  both  areas,  and  to  identify  current  assurance  requirements  and  techniques  for  meeting 
them.  Tutorials  characterizing  the  current  state  of  testing  and  proving  techniques  and  identifying  industry  and 
government assurance requirements provided a common basis for five discussion groups. These groups
addressed (1) the role of specifications in testing and proving, (2) hybrid approaches of testing and proving, (3)
levels of assurance, (4) interactions between testing/proving and software engineering, and (5) cost effectiveness.

[Lapr84]  Abstract:  This  paper  deals  with  the  evaluation  of  the  dependability  (considered  as  a  generic  term, 
whose  main  measures  are  reliability,  availability,  and  maintainability)  of  software  systems  during  their  opera¬ 
tional  life,  in  contrast  to  most  of  the  work  performed  up  to  now,  devoted  mainly  to  development  and  validation 
phases. 

The  failure  process  due  to  design  faults,  and  the  behavior  of  a  software  system  up  to  the  first  failure  and 
during  its  life  cycle  are  successively  examined.  An  approximate  model  is  derived  which  enables  one  to  account 
for  the  failures  due  to  the  design  faults  in  a  simple  way  when  evaluating  a  system’s  dependability.  This  model  is 
then  used  for  evaluating  the  dependability  of  1)  a  software  system  tolerating  design  faults,  and  2)  a  computing  sys¬ 
tem  with  respect  to  physical  and  design  faults. 

[Lask79]  Summary:  A  real-environment  interactive  testing  procedure  is  presented  which  is  based  upon  a 
hierarchical  decomposition  of  a  program  into  levels  of  abstraction.  Such  a  decomposition  is  defined  in  terms  of  a 
program model which involves both control and data flow. The testing strategy adopted is supposed to follow the
typical progress of a programmer who carries out a series of experiments with his program. Several semantical
and  structural  issues  involved  are  discussed. 

[Lask82] Abstract: A structural approach to testing employing properties of data flow in a program is proposed.
The basic notion introduced is that of data context of a program block. It represents the set of all tuples of definitions
of the block arguments that are simultaneously live when the control reaches the block. Two testing strategies
have been proposed: block testing, exercising every block of all its elementary contexts, and d-tree testing,
exercising the definition tree rooted at an elementary context of the stop/exit instruction.

[Lask83]  Abstract:  Some  properties  of  a  program’s  data  flow  can  be  used  to  guide  program  testing.  The 
presented  approach  aims  to  exercise  use-definition  chains  that  appear  in  the  program.  Two  such  data  oriented 
testing  strategies  are  proposed;  the  first  involves  checking  liveness  of  every  definition  of  a  variable  at  the  point(s) 
of  its  possible  use;  the  second  deals  with  liveness  of  vectors  of  variables  treated  as  arguments  to  an  instruction  or 
program  block.  Reliability  of  these  strategies  is  discussed  with  respect  to  a  program  containing  an  error. 

[Lask86]  Abstract:  A  codefinition  is  a  set  of  definitions  in  the  program  that  simultaneously  reach  an  instruction 
in  it.  Codefinitions  are  used  in  data-based  program  testing.  An  intraprocedural  iterative  algorithm  for  the  deriva¬ 
tion  of  codefinitions  is  presented.  It  has  been  applied  in  a  data-based  testing  tool  recently  implemented. 

[Lask88a] Abstract: A program design methodology is presented that advocates the synthesis of tests
hand-in-hand  with  the  design  at  every  stage  of  program  development  and  uses  them  for  early  detection  of  design 
flaws.  It  involves  formal  specifications  of  abstract  programs  and  abstract  data  refinement  that  appear  in  the 
design.  Main  findings:  1)  Formalization  facilitates  black-box  and  design-based  functional  testing,  2)  Abstract 
data  testing  allows  a  more  natural  selection  of  tests  than  concrete  data  testing,  3)  Black-box  testing  leads  to  signi¬ 
ficant  structural  coverage,  4)  The  method  can  be  combined  with  formal  verification. 

[Lasa79]  Abstract:  Several  examples  of  simple  program  schemes  are  used  to  study  the  influence  of  basic  con¬ 
structs  on  measures  of  Software  Science.  A  minimal  low  level  language  is  used  in  order  to  build  examples  which 
contain  large  numbers  of  the  constructs  under  study.  The  measures  are  expressed  as  functions  depending  on  the 
number  of  conceptually  unique  input-output  operands.  They  may  therefore  be  evaluated  and  compared  to  their 
estimators  analytically  rather  than  statistically. 

[Lass81]  Abstract:  The  claims  that  software  science  could  provide  an  empirical  basis  for  the  rationalization  of 
all  forms  of  algorithm  description  are  shown  to  be  invalid  from  a  formal  point  of  view.  In  particular,  the 
conjectured dichotomy between operators and operands is shown not to hold over a wide class of languages. An
experiment  that  investigated  discrepancies  between  the  level  measure  and  its  estimator  is  described  to  show  that 
its  failure  was  due  to  shortcomings  in  the  theory.  One  cannot  obtain  reliable  results  without  tampering  with  both 
measure  and  estimator  definitions. 

[Laue79]  Summary:  Debugging  is  efficient  if  it  detects  all  program  errors  in  a  short  time.  This  paper  discusses 
several techniques for improving debugging efficiency. Attention is given both to the initial debugging and to
acceptance  testing  in  the  maintenance  stage.  A  main  decision  is  whether  to  use  top-down  or  bottom-up  debug¬ 
ging,  and  it  is  suggested  that  top-down  debugging  is  more  efficient  if  combined  with  some  of  the  other  techniques. 
All  the  techniques  shown  are  independent  of  any  particular  language  or  debug  software. 

[Lave88]  Abstract:  This  paper  is  a  discussion  of  issues  related  to  the  thesis  entitled  “The  Explication  of  Process- 
Product  Relationships  in  DoD-STD-2167  and  DoD-STD-2168  via  an  Augmented  Data  Flow  Diagram  Model.” 

In  particular,  the  major  results  of  the  above  thesis  are  viewed  in  light  of  the  draft  standards  DoD- 
STD-2167A  and  DoD-STD-2168  (both  dated  1  April  1987),  and  the  issue  of  development  objectives  is  explored. 

The  ideas  presented  in  this  paper  represent  the  author’s  opinion  and  are  speculative  in  nature  due  to  the 
fact  that,  at  present,  the  revised  DoD  standards  are  in  draft  form,  and  the  issue  of  development  objectives  has 
not  yet  been  thoroughly  investigated. 

[Lawr81]  Abstract:  Programming  data  involving  278  commercial-type  programs  were  collected  from  23 
medium-to-large-scale  organizations  in  order  to  explore  the  relationships  among  variables  measuring  program 
type,  the  testing  interface,  programming  technique,  programmer  experience,  and  productivity.  Programming 
technique  and  programmer  experience  after  1  year  were  found  to  have  no  impact  on  productivity,  whereas 
on-line  testing  was  found  to  reduce  productivity.  A  number  of  analyses  of  the  data  are  presented,  and  their  rela¬ 
tionship  to  other  studies  is  discussed. 

[LeDo85]  Abstract:  A  trace  database  model  for  debugging  concurrent  Ada  programs  is  presented.  In  this 
approach,  trace  information  is  captured  in  an  historical  database  and  queried  using  Prolog.  This  model  was  used 
to  build  a  prototype  debugger,  called  Your  Own  Ada  Debugger  (YODA).  The  design  of  YODA  is  described  and 
a  trace  analysis  of  a  sample  program  exhibiting  misuse  of  shared  data  is  presented.  Because  the  trace  database 
model  is  flexible  and  general,  it  can  aid  diagnosis  of  a  variety  of  runtime  errors. 

[Leac87] Abstract: A major goal of software engineering research is the development of metrics which measure
the  complexity  and  maintainability  of  programs,  with  a  small  portion  of  this  effort  directed  specifically  towards 
programs  written  in  Ada.  This  paper  will  focus  on  two  main  themes.  The  first  theme  will  be  the  development  of 
metrics  that  specifically  reflect  the  complexity  of  programs  in  Ada.  The  second  theme  will  be  an  investigation  of 
the  theoretical  limits  of  metrics  as  measures  of  program  complexity  in  general. 

[Lee88]  Abstract:  Software  creation  requires  not  only  testing  during  the  development  cycle  by  the  development 
staff, but also independent testing following the completion of the implementation. However, in the latter case,
the  amount  of  testing  that  can  be  carried  out  is  often  limited  by  time  and  resources.  At  the  very  most,  indepen¬ 
dent  testing  can  be  expected  to  provide  100%  test  coverage  of  the  test  requirements  (or  specifications)  associated 
with  the  software  element  with  the  minimum  of  effort.  This  paper  describes  a  methodology  employing  Integer 
Programming  by  which  the  amount  of  testing  required  to  provide  the  maximum  possible  test  coverage  of  the 
requirements  (for  the  given  test  set)  is  assured  while  at  the  same  time  minimizing  the  total  number  of  tests  to  be 
included  in  a  test  suite.  A  collateral  procedure  provides  recommendations  on  which  tests  might  be  eliminated  if 
less  than  100%  test  coverage  of  the  test  requirements  is  permitted.  This  latter  procedure  will  be  useful  in  deter¬ 
mining  the  risk  of  not  running  the  minimum  set  of  tests  for  100%  test  coverage.  A  third  process  selects  from  the 
test matrix the set of tests to be applied to the system following maintenance modifications of any test requirements
- that is, to provide a submatrix for regression testing. The potential benefits of applying the integer programming
technique in test data selection are also discussed.
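
The selection problem described in [Lee88] (full coverage of the test requirements with as few tests as possible) can be sketched as a small covering problem. The code below is not the paper's integer programming formulation; it substitutes an exhaustive search that gives the same minimum on small inputs. The function name and the sample test matrix are hypothetical.

```python
from itertools import combinations


def minimal_covering_tests(coverage):
    """Return a smallest list of tests covering every requirement in the matrix.

    coverage maps a test name to the set of requirement ids it exercises.
    Exhaustive search stands in for the integer programming solver here.
    """
    target = set().union(*coverage.values())
    names = sorted(coverage)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            covered = set()
            for name in subset:
                covered |= coverage[name]
            if covered == target:
                return list(subset)
    return names


# Hypothetical test matrix: four tests, three requirements.
matrix = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r3"},
    "t4": {"r1", "r3"},
}
print(minimal_covering_tests(matrix))
```

The search is exponential in the number of tests, so it is only a stand-in for illustration; a realistic test matrix would call for an actual integer programming solver.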

[Less81] Abstract: A new approach for structuring distributed processing systems, called functionally accurate,
cooperative  (FA/C),  is  proposed.  The  approach  differs  from  conventional  ones  in  its  emphasis  on  handling  dis¬ 
tribution-caused  uncertainty  and  errors  as  an  integral  part  of  the  network  problem-solving  process.  In  this 
approach  nodes  cooperatively  problem-solve  by  exchanging  partial  tentative  results  (at  various  levels  of  abstrac¬ 
tion)  within  the  context  of  common  goals.  The  approach  is  especially  suited  to  applications  in  which  the  data 
necessary  to  achieve  a  solution  cannot  be  partitioned  in  such  a  way  that  a  node  can  complete  a  task  without  see¬ 
ing  the  intermediate  state  of  task  processing  at  other  nodes.  Much  of  the  inspiration  for  the  FA/C  approach 
comes from the mechanisms used in knowledge-based artificial intelligence (AI) systems for resolving uncertainty
caused by noisy input data and the use of approximate knowledge. The appropriateness of the FA/C
approach  is  explored  in  three  application  domains:  distributed  interpretation,  distributed  network  traffic-light 
control,  and  distributed  planning.  Additionally,  the  relationship  between  the  approach  and  the  structure  of 
management organizations is developed. Finally, a number of current research directions necessary to more fully
develop  the  FA/C  approach  are  outlined.  These  research  directions  include  distributed  search,  the  integration  of 
implicit  and  explicit  forms  of  control,  and  distributed  planning  and  organizational  self-design. 

[Letra88]  Abstract:  Regression  testing  is  a  significant,  but  largely  unexplored  topic.  In  this  report,  the  problem 
of  regression  testing  is  analysed,  and  several  important  notions  are  introduced:  the  types  of  regression  testing, 
the  test  case  classification  according  to  changes  and  the  regression  number.  Regression  testing  can  be  grouped 
into  corrective  regression  testing  and  progressive  regression  testing,  depending  on  the  stability  of  the  specification. 
The  test  cases  can  be  grouped  into  five  classes:  reusable,  testable,  obsolete,  changed  and  new  test  cases.  A  prob¬ 
lem  facing  all  retesters  is  the  proper  identification  of  test  classes.  The  notion  of  regression  number  is  introduced 
as  a  measure  of  the  number  of  test  cases  affected  by  a  single  instruction  change.  A  program  component  called  a 
retestable  unit  is  proposed  to  encapsulate  the  effect  of  changes  on  a  program.  The  use  of  retestable  unit  may 
reduce  the  effort  in  test  selection  for  regression  testing.  An  algorithm  for  computing  a  retestable  unit  is  given, 
and  a  preliminary  experiment  on  retestable  unit  is  reported.  The  regression  testing  problem  can  be  decomposed 
into  two  subproblems:  the  test  selection  problem  and  the  test  plan  update  problem.  This  report  presents  a  solution 
to the test plan update problem, which involves the use of a unique data structure for storing program information
during testing. The data structure allows an easy manipulation of this information for the purpose of classifying
the test cases.

[Leve83b]  Abstract:  With  the  increased  use  of  software  controls  in  critical  real-time  applications,  a  new  dimen¬ 
sion  has  been  introduced  into  software  reliability  -  the  “cost”  of  errors.  The  problems  of  safety  have  become  crit¬ 
ical  as  these  applications  have  increasingly  included  areas  where  the  consequences  of  failure  are  serious  and  may 
involve  grave  dangers  to  human  life  and  property.  This  paper  defines  software  safety  and  describes  a  technique 
called  software  fault  tree  analysis  which  can  be  used  to  analyze  a  design  as  to  its  safety.  The  technique  has  been 
applied  to  a  program  which  controls  the  flight  and  telemetry  for  a  University  of  California  spacecraft.  A  critical 
failure  scenario  was  detected  by  the  technique  which  had  not  been  revealed  during  substantial  testing  of  the  pro¬ 
gram.  Parts  of  this  analysis  are  presented  as  an  example  of  the  use  of  the  technique  and  the  results  are  discussed. 

[Leve83c]  Abstract:  Software  is  increasingly  being  used  in  the  control  of  potentially  hazardous  systems. 
Software  fault-tree  analysis  is  a  technique  for  analyzing  the  logic  of  software  for  any  potential  contribution  to  sys¬ 
tem  mishaps.  The  technique  is  described  using  Ada  as  an  example  real-time  language.  Special  consideration  is 
given  to  the  problems  of  concurrency  and  real-time  constraints  which  are  common  in  these  types  of  applications. 

[Leve86b]  Abstract:  Software  safety  issues  become  important  when  computers  are  used  to  control  real-time, 
safety-critical  processes.  This  survey  attempts  to  explain  why  there  is  a  problem,  what  the  problem  is,  and  what  is 
known  about  how  to  solve  it.  Since  this  is  a  relatively  new  software  research  area,  emphasis  is  placed  on  delineat¬ 
ing  the  outstanding  issues  and  research  topics. 

[Leve87] Abstract: The application of Time Petri net modeling and analysis techniques to safety-critical real-time
systems is explored and procedures described which allow analysis of safety, recoverability, and
fault-tolerance.

[Levi78]  Panel  Overview:  Formal  methods,  i.e.,  use  of  mathematical  rigor,  have  been  employed  by  research 
computer scientists in their attempt to develop general results for many aspects of computer science, e.g., computational
complexity, undecidability, numerical analysis, programming language semantics. Much of this work
has  had  little  impact  on  those  charged  with  producing  working  software  systems.  However,  in  recent  years 
numerous  researchers  have  suggested  that  by  applying  formal  methods  to  the  realization  of  systems,  the  quality 
of  such  systems  could  be  significantly  improved.  Such  formal  methods  could  be  applied  in  the  structuring, 
specification,  verification,  and  analysis  of  performance  for  systems.  The  position  statements  below  explore  the 
use  of  these  techniques  in  the  production  of  systems.  The  general  opinion  is  that  formal  methods  will  ultimately 
assume  a  vital  role,  but  for  the  present  their  use  will  be  restricted  to  particular  systems  produced  by  skilled  indivi¬ 
duals.  The  use  of  the  formal  methods  will  gradually  increase  as  the  techniques  are  refined  and  applied  to  a  larger 
variety  of  systems,  as  tools  are  developed  to  support  their  use,  and  as  the  general  community  becomes  better 
educated  in  formal  methods. 

[Levi80] Abstract: This thesis presents proof rules for an extension of Hoare’s Communicating Sequential
Processes  (CSP).  CSP  is  a  notation  for  describing  processes  that  interact  through  communication,  which  pro¬ 
vides  the  sole  means  of  synchronizing  and  passing  information  between  processes.  A  sending  process  is  delayed 
until  some  process  is  ready  to  receive  the  message;  a  receiving  process  is  delayed  until  there  is  a  message  to  be 
received.  It  is  this  delay  that  provides  synchronization. 

A  proof  of  a  program  is  with  respect  to  pre-  and  post-conditions.  A  proof  of  weak  correctness  shows  that 
execution  of  the  program  beginning  in  a  state  satisfying  the  pre-condition  terminates  in  a  state  satisfying  the  post¬ 
condition,  providing  deadlock  does  not  occur.  A  proof  of  strong  correctness,  in  addition,  shows  that  deadlock 
cannot  occur. 

A proof of weak correctness has three stages: a sequential proof, a satisfaction proof, and a non-interference
proof. A sequential proof reflects the effects of a process running in isolation. A satisfaction proof combines
sequential proofs of processes, reflecting the message passing and synchronization aspects of communication.
A non-interference proof shows that no process affects the validity of the proof of another process.

The  introduction  of  the  satisfaction  proof  and  our  symmetric  treatment  of  send  and  receive  are  important 
aspects  of  this  thesis.  By  treating  send  and  receive  on  an  equal  basis,  we  simplify  our  rules  and  allow  the  inclu¬ 
sion  of  send  in  guards. 

A  sufficient  condition  for  freedom  from  deadlock  is  given  that  depends  on  the  proof  of  weak  correctness; 
this  is  used  to  prove  strong  correctness.  In  general,  freedom  from  deadlock  can  be  very  hard  to  check.  There¬ 
fore,  we  derive  special  cases  in  which  we  can  reduce  the  work  needed  to  verify  that  a  program  is  free  from 
deadlock. 

We  also  present  an  algorithm  for  globally  synchronizing  processes;  that  is,  each  process  can  recognize  that 
all  processes  are  simultaneously  in  a  given  state.  It  works  by  recognizing  a  special  class  of  deadlock.  Having  this 
algorithm  allows  us  to  modify  programs  that  deadlock  when  the  post-condition  is  established,  so  that  they  ter¬ 
minate  normally. 

[Levi81] Abstract: Proof rules are presented for an extension of Hoare’s communicating sequential processes.
The  rules  deal  with  total  correctness;  all  programs  terminate  in  the  absence  of  deadlock.  The  commands  send 
and  receive  are  treated  symmetrically,  simplifying  the  rules  and  allowing  send  to  appear  in  guards.  Also  given  are 
sufficient  conditions  for  showing  that  a  program  is  deadlock  free.  An  extended  example  illustrates  the  use  of  the 
technique. 

[Levy84] Abstract: A strategy for performing type checking on programs built out of separately compiled parts is
presented.  This  strategy  is  used  in  a  programming  environment  that  allows  small  components  of  a  software  sys¬ 
tem  to  be  reconfigured  in  different  ways.  The  strategy  works  by  inferring  type  schemas  for  all  of  the  undeclared 
functions  used  by  a  component  and  then  unifying  each  schema  with  a  program  library  when  a  configuration  is 
built. 

[Lew88] Abstract: The complexity of software often affects its reliability. In order to produce reliable software,
its  complexity  must  be  controlled  by  suitably  decomposing  the  software  system  into  smaller  subsystems.  In  this 
paper,  a  software  complexity  metric  is  developed  which  includes  both  the  internal  and  external  complexity  of  a 
module.  This  allows  analysis  of  a  software  system  during  its  development  and  provides  a  guide  to  system  decom¬ 
position.  The  basis  of  this  complexity  metric  is  in  the  development  of  an  external  complexity  measure  which 
characterizes  module  interaction. 

[Li87] Abstract: Software metrics are computed for the purpose of evaluating certain characteristics of the
software developed. A Fortran static source code analyzer, FORTRANAL, was developed to study 31 metrics,
including  a  new  hybrid  metric  introduced  in  this  paper,  and  applied  to  a  database  of  255  programs,  all  of  which 
were  student  assignments.  Comparisons  among  these  metrics  are  performed.  Their  cross-correlation  confirms 
the  internal  consistency  of  some  of  these  metrics  which  belong  to  the  same  class.  To  remedy  the  incompleteness 
of  most  of  these  metrics,  the  proposed  metric  incorporates  context  sensitivity  to  structural  attributes  extracted 
from  a  flow  graph.  It  is  also  concluded  that  many  volume  metrics  have  similar  performance  while  some  control 
metrics  surprisingly  correlate  well  with  typical  volume  metrics  in  the  test  samples  used.  A  flexible  class  of  hybrid 
metric  can  incorporate  both  volume  and  control  attributes  in  assessing  software  complexity. 

[Lind85]  Abbreviated  Introduction:  This  paper  describes  a  specification  and  validation  method  that  allows  vali¬ 
dation  tests  to  be  generated  in  a  White-Box  fashion  and  administered  in  a  Black-Box  fashion.  The  paper  presents 
a  specification  technique  for  CAIS  that  uses  an  Ada-based  description  of  CAIS  facilities.  This  Abstract 
Machine  approach  to  specifying  CAIS  is  summarized.  The  paper  addresses  a  two-phase  approach  to  developing 
validation tests from the Abstract description of CAIS. In the first phase, existing testing technology is applied
to  isolate  needed  test  data  points  in  terms  of  the  inputs  and  expected  outputs.  The  second  phase  converts  the  test 
data  points,  using  further  analysis  of  the  specification,  into  Ada  tools  that  do  not  rely  on  data  internal  to  the 
Abstract  description. 

[Lind88a]  Abstract:  This  analysis  tool  produces  I/O  pairs  that  represent  program  execution  paths.  You  can  use 
these  pairs  as  hurdles  for  program  testing  and  interface  validation  to  overcome. 

[Lind88d]  Abbreviated  Introduction:  As  the  title  suggests,  this  paper  is  a  survey  of  computer  support  for  formal 
reasoning,  but  primarily  from  the  point  of  view  of  software  engineering  applications.  It  makes  no  claim  to  being 
an  objective  comparison  of  theorem  proving  systems  per  se,  nor  does  it  claim  to  present  all  the  features  of  the 
various  systems.  Instead,  it  is  intended  to  be  an  introduction  to  existing  systems  and  ongoing  research,  gathering 
together  information  which  has  often  only  appeared  before  in  narrowly  distributed  technical  reports. 

[Lind89]  Abstract:  In  this  paper,  we  report  the  results  of  an  experimental  study  of  software  metrics  for  a  fairly 
large  software  system  used  in  a  real-time  application.  We  examine  a  number  of  issues,  including  the  mutual  rela¬ 
tionship  between  various  software  metrics  and,  more  importantly,  the  relationship  between  metrics  and  the 
development  effort.  We  report  some  interesting  connections  between  metrics  and  the  software  development 
effort. 

[Ling79]  Table  of  Contents:  Precision  Programming.  Elements  of  logical  expression.  Elements  of  program 
expression,  syntax  control  structures,  syntax  data  structures,  syntax  system  structures,  structured  programs,  pro¬ 
gram  execution,  program  functions,  program  structures.  Reading  structured  programs.  The  correctness  of  struc¬ 
tured  programs,  verifying  structured  programs,  correctness  of  prime  programs,  techniques  for  proving  program 
correctness,  examples,  loop  invariants  in  correctness  proofs,  formulas  for  correct  structured  programs.  Writing 
structured  programs. 

[Linn88]  Abstract:  IDA  Paper  P-2035  presents  the  SDI  Architecture  Dataflow  Modeling  Technique  (SADMT), 
a  uniform  formal  notation  for  the  description  of  SDI  system  architectures  and  the  Battle  Management  and
Command,  Control,  and  Communication  (BM/C3)  architectures.  SADMT  is  a  technique  for  thinking  about  and
describing  architectural  processes  and  structures  that  use  the  typing  and  functional  facilities  of  the  Ada
programming  language.  The  document  defines  SADMT  and  the  programming  interface  to  the  SADMT  Simulation
Facility  (SADMT/SF).  The  issues  addressed  here  are  those  relevant  to  providing  formal  descriptions  of  system
structure  and  behavior  for  interface  consistency  checking,  system  simulation,  and  system  evaluation.

[Lisk75]  Abstract:  The  main  purposes  in  writing  this  paper  are  to  discuss  the  importance  of  formal  specifica¬ 
tions  and  to  survey  a  number  of  promising  specification  techniques.  The  role  of  formal  specifications  both  in 
proofs  of  program  correctness,  and  in  programming  methodologies  leading  to  programs  which  are  correct  by 
construction,  is  explained.  Some  criteria  are  established  for  evaluating  the  practical  potential  of  specification 
techniques.  The  importance  of  providing  specifications  at  the  right  level  of  abstraction  is  discussed,  and  a  partic¬ 
ularly  interesting  class  of  specification  techniques,  those  used  to  construct  specifications  of  data  abstractions,  is 
identified.  A  number  of  specification  techniques  for  describing  data  abstractions  are  surveyed  and  evaluated  with 
respect  to  the  criteria.  Finally,  directions  for  future  research  are  indicated. 

[Lite76]  Abstract:  The  paper  provides  data  on  Cobol  error  frequency  for  correction  of  errors  in  student-oriented
compilers,  improvement  of  teaching,  and  changes  in  programming  language.  Cobol  was  studied  because  of 
economic  importance,  widespread  usage,  possible  error-inducing  design,  and  lack  of  research.  The  types  of 
errors  were  identified  in  a  pilot  study;  then,  using  the  132  error  types  found,  1,777  errors  were  classified  in  1,400 
runs  of  73  Cobol  students.  Error  density  was  high:  20  percent  of  the  types  contained  80  percent  of  the  total  fre¬ 
quency,  which  implies  high  potential  effectiveness  for  software-based  correction  of  Cobol.  Surprisingly,  only  four 
high-frequency  errors  were  error-prone,  which  implies  minimal  error  inducing  design.  80  percent  of  Cobol 
misspellings  were  classifiable  in  the  four  error  categories  of  previous  researchers,  which  implies  that  Cobol 
misspellings  are  correctable  by  existent  algorithms.  Reserved  word  usage  was  not  error-prone,  which  implies 
minimal  interference  with  usage  of  reserved  words.  Over  80  percent  of  error  diagnosis  was  found  to  be  inaccu¬ 
rate.  Such  feedback  is  not  optimal  for  users,  particularly  for  the  learning  user  of  Cobol. 

[Litt73]  Summary:  A  Bayesian  reliability  growth  model  is  presented  which  includes  special  features  designed  to 
reproduce  special  properties  of  the  growth  in  reliability  of  an  item  of  computer  software  (program).  The  model 
treats  the  situation  where  the  program  is  sufficiently  complete  to  work  for  continuous  time  periods  between 
failures,  and  gives  a  repair  rule  for  the  action  of  the  programmer  at  such  failures.  Analysis  is  based  entirely  upon 
the  length  of  the  periods  of  working  between  repairs  and  failures,  and  does  not  attempt  to  take  account  of  the 
internal  structure  of  the  program.  Methods  of  inference  about  the  parameters  of  the  model  are  discussed. 

[Litt75]  Abstract:  A  system  is  considered  in  which  switching  takes  place  between  sub-systems  according  to  a 
continuous  parameter  Markov  chain.  Failures  may  occur  in  Poisson  processes  in  the  sub-systems,  and  in  the 
transitions  between  sub-systems.  All  failure  processes  are  independent.  The  overall  failure  process  is  described 
exactly  and  asymptotically  for  highly  reliable  sub-systems.  An  application  to  process-control  computer  software 
is  suggested. 

[Litt78]  Abstract:  This  paper  examines  critically,  with  a  view  to  stimulating  a  discussion,  some  concepts  which 
have  been  used  in  early  work  on  software  reliability  measurement,  and  suggests  improvements  and  areas  of 
potentially  fruitful  future  research.  It  is  proposed  that  hardware-motivated  measures  such  as  mttf,  mtbf  should 
not  be  used  for  software  without  justification,  and  it  is  shown  that  such  justification  may  be  lacking  under  quite 
unexceptionable  circumstances.  Alternative  methods  of  measuring  software  reliability  are  proposed.  Emphasis 
is  placed  upon  differentiating  between  two  concepts  of  software  reliability  which  are  often  blurred  in  the  work  of 
previous  authors.  These  are,  on  the  one  hand,  the  reliability  of  the  program-as-it-is  (the  number  of  bugs  it  con¬ 
tains),  on  the  other,  the  reliability  of  the  program-as-it-performs  (failure  rate,  distribution  of  time  to  next  failure, 
etc.).  It  is  argued  that  the  latter,  here  called  operational  reliability,  is  the  one  we  should  use.  Measures  of
operational  reliability  which  avoid  use  of  mttf,  etc.,  are  proposed.  A  case  is  made  for  software  engineers  adopting  a
Bayesian  stand-point:  both  in  the  interpretation  of  probability  statements  and  in  inference  procedures.  It  is
suggested  that  reliability  modeling  solely  in  terms  of  failures  (or  number  of  bugs)  is  unnecessarily  naive.  Interest
really  centers  upon  the  consequences  of  failures  as  much  as  on  their  frequency.  It  is  proposed  that  more  effort  be
devoted  to  the  development  of  models  which  incorporate  a  cost  (or  utility)  structure.  Finally,  brief  consideration 
is  given  to  the  question  of  program  structure.  The  enormous  success  of  hardware  reliability  theory,  in  combining 
component  reliabilities  with  knowledge  of  system  structure,  must  be  emulated  for  software.  Unfortunately, 
software  structure  does  not  easily  lend  itself  to  such  an  exercise.  Some  existing  models  are  considered. 

[Litt79]  Abstract:  The  paper  treats  a  modular  program  in  which  transfers  of  control  between  modules  follow  a 
semi-Markov  process.  Each  module  is  failure  prone,  and  the  different  failure  processes  are  assumed  to  be  Pois¬ 
son.  The  transfers  of  control  between  modules  (interfaces)  are  themselves  subject  to  failure.  The  overall  failure 
process  of  the  program  is  described,  and  an  asymptotic  Poisson  approximation  is  given  for  the  case  when  the 
individual  modules  and  interfaces  are  very  reliable.  A  simple  formula  gives  the  failure  rate  of  the  overall
program  (and  hence  mean  time  between  failures)  under  this  limiting  condition.  The  remainder  of  the  paper  treats
the  consequences  of  failures.  Each  failure  results  in  a  cost,  represented  by  a  random  variable  with  a  distribution 
typical  of  the  type  of  failure.  The  quantity  of  interest  is  the  total  cost  of  running  the  program  for  a  time  t,  and  a 
simple  approximating  distribution  is  given  for  large  t.  The  parameters  of  this  limiting  distribution  are  functions
only  of  the  means  and  variances  of  the  underlying  distributions,  and  thus  are  readily  estimable.  A  calculation  of 
program  availability  is  given  as  an  example  of  the  cost  process.  There  follows  a  brief  discussion  of  methods  of 
estimating  the  parameters  of  the  model,  with  suggestions  of  areas  in  which  it  might  be  useful. 

[Litt80a]  Introduction:  It  is  instructive  to  look  at  some  of  the  reasons  advanced  by  software  developers  for  their
reluctance  to  use  software  reliability  measurement  tools.  Here  are  a  few  common  ones: 

1.  “Software  reliability  models  are  statistical.  Programs  are  deterministic.  If  certain  input  conditions  cause  a  mal¬ 
function  today,  then  the  same  conditions  are  certain  to  cause  a  malfunction  if  they  occur  tomorrow.  Where  is 
the  randomness?” 

2.  “I  am  paid  to  write  reliable  programs.  I  use  the  best  programming  methodology  to  achieve  this.  Software  relia¬ 
bility  estimation  procedures  would  not  help  me  to  improve  the  reliability  of  my  programs.” 

3.  “We  verify  our  software.  When  it  leaves  us  it  is  correct.” 

4.  “I  ran  your  software  reliability  measurement  program  on  some  data  from  a  current  project  of  ours.  It  said 
there  was  an  infinite  number  of  bugs  left  in  the  program.  Who  are  you  trying  to  kid?” 

5.  (same  manager  as  4,  but  one  week  later)  “We  corrected  a  couple  of  bugs  and  ran  the  reliability  measurement
program  again.  This  time  it  said  that  there  were  over  200  bugs  left.  Infinity  minus  two  equals  two  hundred?  Is 
this  the  new  math?” 

6.  “We  put  a  lot  of  effort  into  testing.  The  selection  of  test  data  is  a  systematic  process  designed  to  seek  out  bugs. 
Reliability  estimation  based  on  such  test  data  would  be  no  guide  to  the  performance  of  the  program  in  a  use 
environment.” 

7.  “We  are  writing  an  air  traffic  control  program.  Total  system  crash  would  be  catastrophic.  Other  failures  range 
from  serious  to  trivial.  Reliability  models  do  not  distinguish  between  failures  of  differing  severity.” 

Although  [the  author  has]  been  involved  in  software  reliability  modeling  for  the  past  decade,  and  [has
himself]  perpetrated  a  few  models,  [he  has]  a  great  deal  of  sympathy  with  some  of  the  sentiments  expressed  above.
[The  author  has]  a  feeling  that  some  of  the  early  models  have  been  oversold,  that  not  enough  emphasis  has  been 
placed  on  the  underlying  modeling  assumptions,  and  that  by  concentrating  on  a  simple  reliability  analysis  we 
might  be  ignoring  wider  concerns.  In  this  paper  [the  author]  shall  be  looking  at  one  common  deficiency  of  early 
models  and  suggesting  a  way  in  which  it  can  be  overcome.  [The  author  hopes]  that,  in  passing,  some  new  insight
into  the  wider  issues  will  be  gained. 

[Litt80b]  Abstract:  An  examination  of  the  assumptions  used  in  early  bug  counting  models  of  software  reliability 
shows  them  to  be  deficient.  Suggestions  are  made  to  improve  modeling  assumptions  and  examples  are  given  of 
mathematical  implementations.  Model  verification  via  real-life  data  is  discussed  and  minimum  requirements  are 
presented.  An  example  shows  how  these  requirements  may  be  satisfied  in  practice.  It  is  suggested  that  current 
theories  are  only  the  first  step  along  what  threatens  to  be  a  long  road.

[Litt81a]  Abstract:  An  assumption  commonly  made  in  early  models  of  software  reliability  is  that  the  failure  rate
of  a  program  is  a  constant  multiple  of  the  (unknown)  number  of  faults  remaining.  This  implies  that  all  faults
contribute  the  same  amount  to  the  overall  failure  rate  of  the  program.  The  assumption  is  challenged  and  an
alternative  proposed.  The  suggested  model  results  in  earlier  fault  fixes  having  a  greater  effect  than  later  ones  (the  faults
which  make  the  greatest  contribution  to  the  overall  failure  rate  tend  to  show  themselves  earlier,  and  so  are  fixed 
earlier),  and  the  DFR  property  between  fault  fixes  (assurance  about  programs  increases  during  periods  of  failure- 
free  operation,  as  well  as  at  fault  fixes).  The  model  is  tractable  and  allows  a  variety  of  reliability  measures  to  be 
calculated.  Predictions  of  total  execution  time  to  achieve  a  target  reliability,  and  total  number  of  fault  fixes  to  tar¬ 
get  reliability,  are  obtained.  The  model  might  also  apply  to  hardware  reliability  growth  resulting  from  the  elimina¬ 
tion  of  design  errors. 
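
The constant-multiple assumption that this abstract challenges can be made concrete with a short sketch. The names `failure_rate`, `rate_drops`, `total_faults`, and `phi` are illustrative assumptions, not taken from the paper, and the code models only the early-model assumption, not the alternative the paper proposes.

```python
def failure_rate(total_faults: int, fixed: int, phi: float) -> float:
    """Failure rate under the early-model assumption: a constant
    multiple (phi) of the number of faults still remaining."""
    remaining = total_faults - fixed
    if remaining < 0:
        raise ValueError("cannot fix more faults than exist")
    return phi * remaining


def rate_drops(total_faults: int, phi: float) -> list[float]:
    """Reduction in failure rate achieved by each successive fault fix."""
    rates = [failure_rate(total_faults, k, phi) for k in range(total_faults + 1)]
    return [rates[k] - rates[k + 1] for k in range(total_faults)]
```

Under this assumption every entry of `rate_drops(...)` equals `phi`, so the first fix helps exactly as much as the last. That uniformity is what the abstract disputes when it argues that earlier fixes should have a greater effect than later ones.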

[Lohs84]  Overview:  The  importance  of  the  scientific  investigations  of  software  design  principles  is  discussed, 
and  an  experimental  investigation  of  the  importance  of  the  design  principle  of  module  coupling  is  described.  One 
important  dimension  of  coupling,  as  promoted  by  the  authors  of  the  structured  design  methodology,  is  that  of 
global  variable  vs.  parameterized  methods  of  intermodule  communications.  It  is  shown  that  different  proposed 
software  metrics  provide  conflicting  conclusions  as  to  the  preferred  method  of  intermodule  communication.  The 
three  experiments  reported  herein  were  performed  in  university  software  engineering  courses  taken  by  graduate 
students  and  upper  level  undergraduate  majors  in  computer  science.  They  address  the  effect  of  global  vs. 
parameterized  interfaces  on  system  modifiability.  While  the  type  of  modification  being  performed  significantly 
influenced  the  modifiability  of  the  system,  there  were  no  consistent  effects  due  to  the  type  of  coupling  present  in 
the  system. 

[Lond75]  Abstract:  One  person’s  perspective  of  program  verification  and  its  relation  to  some  aspects  of  reliable 
software  is  presented.  The  main  verification  method  of  inductive  assertions  is  illustrated  with  several  variations
of  one  detailed  example;  a  second  example  shows  a  surprisingly  simple  inductive  assertion  proof  of  an  iterative 
tree  traversal  example.  Briefly  discussed  also  are  the  implicit  assumptions  of  most  verifications,  proving  termi¬ 
nation,  the  creating  of  assertions,  and  languages  in  which  to  write  assertions.  An  abstract  overview  is  given  of 
existing  program  verification  systems  together  with  a  sample  list  of  verified  programs.  A  short  bibliography  is 
included. 

[Lond85]  Abbreviated  Introduction:  The  availability  of  today’s  powerful  personal  workstations  with  high-resolu¬ 
tion  bit-map  displays  and  pointing  devices  makes  possible  the  creation  and  display  of  drawings  containing  a  wide 
assortment  of  characters,  fonts,  icons,  and  figures,  all  of  which  can  be  continuously  moved  for  realistic  anima¬ 
tion.  We  are  currently  involved  in  using  such  animation  to  visualize  programs  and  algorithms  by  creating  graphi¬ 
cal  snapshots  and  movies  correlated  with  the  programs’  actions.  Such  a  facility  we  hope  will  provide  program¬ 
mers  or  computer  users  in  general  with  an  understanding  of  what  the  programs  do,  how  they  work,  and  why  they 
work.  It  also  will  give  users  visual  feedback  as  a  program  and  its  parts  are  being  executed.  This  animation  system 
will  provide  pictorial  representations  of  those  data  structures,  at  the  proper  level  of  abstraction,  which  are  used 
by  a  program.  Standard  representations  of  internal  data  structures,  such  as  linked  lists  or  arrays  with  separate 
index  variables,  are  often  insufficient  because  the  viewer  must  mentally  transcribe  such  representations  to  the 
abstractions  involved  in  the  use  of  those  structures.  We  use  the  type  of  diagrams  or  sketches  a  programmer  draws 
at  a  desk  or  wallboard,  or  the  kinds  of  schematic  figures  found  in  a  programming  or  data  structures  text;  for¬ 
tunately,  we  do  not  need  pictures  with  exquisite  shadings  that  re-create  photographs.  Such  figures  change  to 
reflect  the  changes  during  the  execution  of  the  program.  People’s  apparent  tendency  to  understand  by  visualizing 
spatially  the  abstractions  that  constitute  the  intention,  or  “meaning,”  of  a  program  is  exploited  by  the  system. 

[Long77]  Abstract:  The  power  industry  is  becoming  increasingly  interested  in  the  use  of  digital  computers 
within  nuclear  plant  protection  systems  in  order  to  satisfy  increased  safety  requirements,  provide  greater
operating  flexibility,  minimize  spurious  forced  outages,  and  (in  conjunction  with  multiplexing)  to  meet  separation 
requirements.  However,  the  development  and  licensing  of  these  digital  safety  systems  has  been  hindered  to  date 
by  the  difficulty  of  validating  software.

This  paper  reviews  the  rationale  for  safety  system  software  validation  requirements.  A  survey  of  current
methodologies  for  the  development  of  software  for  nuclear  power  plant  safety  protection  systems  and  their
associated  limitations  is  provided.

A  methodology  is  then  proposed  for  the  development  and  validation  of  nuclear  power  plant  safety  system 
software  which  may  permit  a  quantitative  assessment  of  its  correctness.  The  main  features  of  the  methodology 
are:  1)  formal  specification  and  documentation  procedures  coupled  with  strict  software  development  restric¬ 
tions,  and  2)  comprehensive  testing  and  program  analysis  used  in  conjunction  with  symbolic  execution  and 
theorem  proving  techniques  to  establish  correctness.  Adoption  of  multiple  specification  and  dual  programming 
teams  in  this  methodology  introduces  a  redundancy  for  easy  detection  of  major  design  and  programming  errors. 
The  latter  also  significantly  reduces  the  amount  of  testing  effort. 

[Long88]  Abbreviation:  [This  paper]  presents  a  representation  for  concurrent  systems,  called  a  task  interaction
graph,  that  facilitates  analysis  [of  the  reliability  of  concurrent  systems].  Our  representation  is  an  extension  of  the 
work  of  Taylor.  We  have  been  developing  a  model  of  interacting  tasks  that  may  considerably  reduce  the  number 
of  states  in  concurrency  graph  representation.  We  call  this  representation  a  Task  Interaction  Concurrency  Graph 
(TICG),  since  it  is  derived  from  a  Task  Interaction  Graph  (TIG)  instead  of  a  control  flow  representation. 

The  TICG  and  TIG  models  have  been  designed  to  capture  the  rendezvous-like  synchronization  found  in 
languages  like  Ada,  Distributed  Processes,  and  CSP.  Task  interaction  graphs  represent  tasks  as  sets  of  regions 
and  interactions  between  regions.  To  date  we  have  developed  rules  for  translating  most  of  the  constructs  sup¬ 
ported  by  Ada  into  the  appropriate  TIG  representation.  We  have  been  investigating  several  kinds  of  analysis 
techniques  that  can  be  applied  to  the  TIG  and  TICG  models.  Deadlock  detection  and  dangerous  parallelism  are 
just  two  examples  of  the  kinds  of  analysis  that  can  be  performed  using  a  TICG  representation.  We  are  particu¬ 
larly  interested  in  investigating  the  extension  of  error  sensitive  testing  techniques  to  concurrent,  real-time  sys¬ 
tems. 

[Love76]  Abstract:  Recent  work  in  the  field  of  Software  Physics  has  produced  several  hypotheses  relating  the 
nature  of  algorithms  to  measurable  properties  of  computer  programs.  One  hypothesis  is  that  Halstead’s
measure  of  E,  the  number  of  elementary  mental  discriminations  required  to  implement  an  algorithm,  is  strongly
related  to  measurable  properties  of  computer  programs.  Several  experiments  have  shown  a  surprisingly  high
correlation  between  E  and  such  measurable  properties  of  programs  as  number  of  bugs,  coding  times,  etc.  This 
paper  will  present  the  results  of  an  independent  study  to  test  this  hypothesis. 
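
The abstract does not define E, so the following is a minimal sketch using the definitions commonly given for Halstead's software science measures. The function name and the four count parameters are hypothetical inputs, and the rules for deciding what counts as an operator or an operand vary between studies.

```python
import math


def halstead_effort(n1: int, n2: int, N1: int, N2: int) -> float:
    """Halstead effort E from operator/operand counts.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total occurrences of operators and of operands
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("need at least one distinct operator and operand")
    vocabulary = n1 + n2                    # n = n1 + n2
    length = N1 + N2                        # N = N1 + N2
    volume = length * math.log2(vocabulary)  # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)       # D = (n1 / 2) * (N2 / n2)
    return difficulty * volume              # E = D * V
```

For example, a fragment with 2 distinct operators, 2 distinct operands, and 4 occurrences of each has V = 16, D = 2, and E = 32.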

[Love77b]  Abstract:  A  within-subjects  experimental  design  was  used  to  test  the  effect  of  two  variables  on  pro¬ 
gram  understanding.  The  independent  variables  were  complexity  of  control  flow  and  paragraphing  of  the  source 
code.  Understanding  was  measured  by  having  the  subjects  memorize  the  code  for  a  fixed  time  and  reconstruct 
the  code  verbatim.  Also  some  subjects  were  asked  to  describe  the  function  of  the  program  after  completing  their 
reconstruction.  The  two  groups  of  subjects  for  the  experiment  were  students  from  an  introductory  programming 
class  and  from  a  graduate  class  in  programming  languages. 

The  major  findings  were  that  paragraphing  of  the  source  had  no  effect  for  either  group  of  subjects  but  that 
programs  with  simplified  control  flow  were  easier  for  the  computer  science  students  to  understand  as  measured 
by  their  ability  to  reconstruct  the  programs.  The  dependent  variable,  rated  accuracy  of  their  description  of  the 
programs’  functions,  did  not  differ  as  a  function  of  either  independent  variable.

The  paper  is  concluded  with  a  description  of  the  utility  of  this  experimental  approach  relative  to  improv¬ 
ing  the  reliability  of  software  and  a  discussion  of  the  importance  of  these  findings. 

[Luck77]  Abstract:  Emphasis  is  placed  on  the  practical  problems  encountered  in  designing  automatic  program
verifiers  and  using  them  as  an  aid  to  programming.  The  paper  includes  an  on-line  interactive  demonstration  of  a 
verifier  and  a  short  survey  of  the  kinds  of  programs  that  have  been  verified  so  far. 

[Luck79a]  Abstract:  A  practical  method  is  presented  for  automating  in  a  uniform  way  the  verification  of  Pascal 
programs  that  operate  on  the  standard  Pascal  data  structures  Array,  Record,  and  Pointer.  New  assertion
language  primitives  are  introduced  for  describing  computational  effects  of  operations  on  these  data  structures.
Axioms  defining  the  semantics  of  the  new  primitives  are  given.  Proof  rules  for  standard  Pascal  operations  on 
data  structures  are  then  defined  using  the  extended  assertion  language.  An  axiomatic  rule  for  the  Pascal  storage 
allocation  operation,  NEW,  is  also  given.  These  rules  have  been  implemented  in  the  Stanford  Pascal  program 
verifier.  Examples  illustrating  the  verification  of  programs  which  operate  on  list  structures  implemented  with 
pointers  and  records  are  discussed.  These  include  programs  with  side  effects. 

[Luck80a]  Abstract:  We  present  a  method  of  formal  specification  of  Ada  programs  containing  packages.  The 
method  suggests  concepts  and  guidelines  useful  for  giving  adequate  informal  documentation  of  packages  by 
means  of  comments. 

The  method  depends  on  (1)  the  standard  inductive  assertion  technique  for  subprograms,  (2)  the  use  of
history  sequences  in  assertions  specifying  the  declaration  and  use  of  packages,  and  (3)  the  addition  of  three 
categories  of  specifications  to  Ada  package  declarations:  (a)  visible  specifications,  (b)  boundary  specifications, 
(c)  internal  specifications. 

Axioms  and  proof  rules  for  the  Ada  package  constructs  (declaration,  instantiation,  and  function  and
procedure  call)  are  given  in  terms  of  history  sequence  and  package  specifications.  These  enable  us  to  construct
formal  proofs  of  the  correctness  of  Ada  programs  with  packages.  The  axioms  and  proof  rules  are  easy  to
implement  in  automated  program  checking  systems.  The  use  of  history  sequences  in  both  informal  documentation
and  formal  specifications  and  proofs  is  illustrated  by  examples. 

[Luck80b]  Abstract:  A  method  of  documenting  exception  propagation  and  handling  in  Ada  programs  is  pro¬ 
posed.  Exception  propagation  declarations  are  introduced  as  a  new  component  of  Ada  specifications,  permitting 
documentation  of  those  exceptions  that  can  be  propagated  by  a  subprogram.  Exception  handlers  are
documented  by  entry  assertions.  Axioms  and  proof  rules  for  Ada  exceptions  are  given.  These  rules  are  simple
extensions  of  previous  rules  for  Pascal  and  define  an  axiomatic  semantics  of  Ada  exceptions.  As  a  result,  Ada
programs  specified  according  to  the  method  can  be  analyzed  by  formal  proof  techniques  for  consistency  with  their
specifications,  even  if  they  employ  exception  propagation  and  handling  to  achieve  required  results  (i.e.,  nonerror 
situations).  Example  verifications  are  given. 

[Luck84a]  Abstract:  A  specification  language  permits  information  about  various  aspects  of  a  program  to  be 
expressed  in  a  precise  machine  processable  form.  This  information  is  not  normally  part  of  the  program  itself. 

Specification  languages  are  viewed  as  evolving  from  modern  high  level  programming  languages.  The  first
step  in  this  evolution  is  cautious  extension  of  the  programming  language.  Some  of  the  features  of  Anna,  a
specification  language  extending  Ada,  are  discussed.  The  extensions  include  generalizations  of  constructs  (such 
as  type  constraints)  that  are  already  in  Ada,  and  new  constructs  for  specifying  subprograms,  packages,  excep¬ 
tions,  and  contexts. 

Anna  has  been  designed  in  collaboration  with  B.  Krieg-Brueckner  and  O.  Owe. 

[Luck85]  Abbreviated  Introduction:  ANNA  is  a  proposal  for  a  specification  language,  or  rather  a  language  in 
which  one  might  experiment  with  specification  languages.  The  work  was  begun  by  Bernd  Krieg-Brueckner  and
myself,  and  subsequent  collaborators  have  been  O.  Owe  from  Oslo,  who  worked  on  the  axiomatic  semantics, 
and  Friedrich  von  Henke  who  worked  on  the  language  reference  manual  and  redesigned  some  of  the  finer  points 
of  the  language.  S.  Sankar,  D.  Rosenblum,  R.  Neff,  and  D.  Bryan  are  currently  implementing  various  prototype 
tools  for  experimentation. 

ANNA  is  a  syntactic  extension  of  ADA:  it  takes  a  subset  of  ADA  productions  and  adds  more.  The
ANNA  specifications  appear  as  formal  ADA  comments.  This  means  ANNA  comments  can  be  processed  by  a 
standard  ADA  tool,  which  will  simply  ignore  them,  and  also  by  special  ANNA  tools. 

All  proposed  ANNA  tools  use  an  extension  of  DIANA,  and  therefore  can  be  interfaced  easily  with  other 
tools  in  an  ADA  environment. 

ANNA  can  be  used  for  comparative  testing.  Comparative  testing  means  comparing  the  ADA  code 
against  its  formal  specifications  for  consistency.  Self-checking  programs  are  ones  which  leave  the  runtime  checks
compiled  from  the  formal  specifications  in  the  program  permanently.

[Luck86a]  Abstract:  This  report  gives  an  overview  of  the  current  status  and  plans  to  construct  a  prototype
environment  of  advanced  tools  for  software  and  hardware  development  based  on  the  use  of  wide-spectrum 
languages.  The  wide-spectrum  languages  include  Anna  (ANNotated  Ada),  and  TSL  (Task  Sequencing 
Language).  The  tools  described  here  provide  interactive  aid  at  all  stages  in  the  system  development  process.  Spe¬ 
cial  emphasis  is  placed  on  distributed  computing,  both  in  providing  tools  for  handling  parallelism  in  the  subject 
system,  and  in  designing  tools  that  utilize  parallelism  in  the  programming  environment.  Applications  of  these
tools  include  requirements  analysis,  formal  specification,  rapid  prototyping,  testing,  formal  verification  and  con¬ 
struction  of  self-testing  Ada  software  for  multi-processor  systems. 

The  report  describes  an  existing  environment  of  prototype  tools  supporting  applications  of  Anna  and  TSL 
to  formal  specification  and  testing  of  Ada  software.  The  new  environment  tools  will  be  based  on  component 
tools  already  developed  at  Stanford  and  proven  to  be  portable  to  various  Ada  environments.  All  tools  are  imple¬ 
mented  in  Ada  and  are  intended  to  interface  with  standard  components  of  Ada  programming  environments. 

[Luck87]  Abstract:  TSL-1  is  a  language  for  specifying  sequences  of  tasking  events  occurring  in  the  execution  of 
distributed  Ada  programs.  Such  specifications  are  intended  primarily  for  testing  and  debugging  of  Ada  tasking 
programs,  although  they  can  also  be  applied  in  designing  programs.  TSL-1  specifications  are  included  in  an  Ada 
program  as  formal  comments.  They  express  constraints  to  be  satisfied  by  the  sequences  of  actual  tasking  events. 
An  Ada  program  is  consistent  with  its  TSL-1  specifications  if  its  runtime  behavior  always  satisfies  them.  This 
paper  presents  an  overview  of  TSL-1.  The  features  of  the  language  are  described  informally,  and  examples
illustrating  the  use  of  TSL-1,  both  for  debugging  and  specification  of  tasking  programs,  are  given.  A  definition  of
robust  TSL-1  specifications  that  takes  into  account  uncertainty  in  runtime  observation  of  behavior  of  distributed 
systems  is  given.  A  runtime  monitor  for  checking  consistency  of  an  Ada  program  with  TSL-1  specifications  has 
been  implemented.  In  the  future,  constructs  for  defining  abstract  tasks  will  be  added  to  TSL-1,  forming  a  new 
language,  TSL-2,  for  the  specification  of  distributed  systems  prior  to  their  implementation  in  any  particular  pro¬ 
gramming  language. 

[MIL85]  Scope:  This  standard  prescribes  the  requirements  for  the  conduct  of  Technical  Reviews  and  Audits  on 
Systems,  Equipments,  and  Computer  Software. 

The  following  technical  reviews  and  audits  shall  be  selected  by  the  program  manager  at  the  appropriate 
phase  of  program  development.  Each  review/audit  is  generally  described  in  Section  3,  Definitions,  and  more 
specifically  defined  in  a  separate  appendix. 

System  Requirements  Review  (SRR) 

System  Design  Review  (SDR) 

Software  Specification  Review  (SSR) 

Preliminary  Design  Review  (PDR) 

Critical  Design  Review  (CDR) 

Test  Readiness  Review  (TRR) 

Functional  Configuration  Audit  (FCA) 

Physical  Configuration  Audit  (PCA) 

Formal  Qualification  Review  (FQR) 

Production  Readiness  Review  (PRR) 

Technical  Reviews  and  Audits  defined  herein  shall  be  conducted  in  accordance  with  this  standard  to  the 
extent  specified  in  the  contract  clauses,  Statement  of  Work  (SOW),  and  the  Contract  Data  Requirements  List. 
Guidance  in  applying  this  standard  is  provided  in  Appendix  J.  The  contracting  agency  shall  tailor  this  standard  to 
require  only  what  is  needed  for  each  individual  acquisition. 

[Majo83]  Abstract:  In  this  paper  an  automated  method  for  testing  programs  against  a  formal  specification  is 
presented.  The  method  is  based  on  the  view  of  a  software  system  as  a  network  of  modules  and  data  capsules 
which are connected via data flows. Modules are specified in terms of their pre- and postconditions; data
capsules are specified in terms of their usage as input and output. For this purpose a special assertion language is
employed.  The  test  procedures  derived  from  the  language  serve  to  simulate  data  capsules  and  modules  when  test¬ 
ing,  generating  arguments  and  validating  results. 

[Mand85]  Abstract:  Multitasking  is  one  of  the  most  novel  aspects  of  Ada.  However,  the  combination  of 
language  primitives  for  concurrent  execution  of  tasks,  synchronization,  termination,  abortion,  exception  han¬ 
dling,  etc.  make  Ada  programs  difficult  to  understand  and  analyze.  This  is  partly  due  to  the  inherent  complexity 
of  the  language  and  partly  due  to  the  lack  of  a  rigorous  definition  of  its  semantics.  The  Ada  Reference  Manual 
describes  semantics  in  informal  English  prose;  as  a  result,  it  is  often  verbose  and  ambiguous. 

The  goal  of  this  paper  is  not  to  provide  a  complete  formal  semantics  of  Ada  multitasking.  Rather,  we  illus¬ 
trate  the  use  of  a  semi-formal  approach  based  on  (timed)  Petri  nets  which  support  a  rigorous  description  of  the 
language.  The  approach  is  described  by  stepwise  refinements  and  is  used  to  describe  several  cases  of  task  interac¬ 
tions,  ranging  from  simple  to  complex  ones.  The  proposed  approach  can  easily  be  applied  in  the  description  of 
other  multitasking  problems  not  covered  in  this  paper. 

[Mann70]  Abstract:  The  problem  of  convergence,  correctness,  and  equivalence  of  computer  programs  can  be 
formulated  by  means  of  the  satisfiability  or  validity  of  certain  first-order  formulas.  An  algorithm  is  presented  for 
constructing  such  formulas  for  functional  programs,  i.e.  programs  defined  by  Lisp-like  conditional  recursive 
expressions. 

[Mann74]  Contents:  The  chapters  of  this  book  discuss  the  following  topics.  Computability:  Finite  automata; 
Turing  machines;  Turing  machines  as  acceptors;  Turing  machines  as  Generators;  Turing  machines  as  algo¬ 
rithms.  Predicate  Calculus:  Basic  notions;  Natural  deduction;  The  resolution  method.  Verification  of  Programs: 
Flowchart  programs;  Flowchart  programs  with  arrays;  Algol-like  programs.  Flowchart  Schemas:  Basic  notions; 
Decision  problems;  Formalization  in  predicate  calculus;  Translation  problems.  The  Fixpoint  Theory  of  Pro¬ 
grams:  Functions  and  functionals;  Recursive  programs;  Verification  methods. 

[Mann78]  Abstract:  This  paper  explores  a  technique  for  proving  the  correctness  and  termination  of  programs 
simultaneously.  This  approach,  the  intermittent  assertion  method,  involves  documenting  the  program  with  asser¬ 
tions  that  must  be  true  at  some  time  when  control  passes  through  the  corresponding  point,  but  that  need  not  be 
true  every  time.  The  method,  introduced  by  Burstall,  promises  to  provide  a  valuable  complement  to  the  more 
conventional  methods. 

The  intermittent-assertion  method  is  presented  with  a  number  of  examples  of  correctness  and  termination 
proofs.  Some  of  these  proofs  are  markedly  simpler  than  their  conventional  counterparts.  On  the  other  hand,  it  is 
shown  that  a  proof  of  correctness  or  termination  by  any  of  the  conventional  techniques  can  be  rephrased  directly 
as  a  proof  using  intermittent  assertions.  Finally,  it  is  shown  how  the  intermittent-assertion  method  can  be  applied 
to  prove  the  validity  of  program  transformations  and  correctness  of  continuously  operating  programs. 

[Math87a]  Abstract:  Several  techniques  have  been  developed  in  the  past  for  improving  the  vectorization  level  of 
a  program  for  fast  execution  on  a  vector  processor  like  the  Cray  X/MP,  Cyber  205  or  Alliant  FX/8.  Most  of  these 
techniques  are  generally  embedded  in  the  language  compilers  of  the  vector  machine,  thereby  making  it  easier  for 
the  programmer  to  benefit  from  them. 

[Math87b]  Abstract:  In  [Math87a]  a  program  transformation  technique  is  presented  that  aids  in  inducing  vectori¬ 
zation  in  a  given  program  P.  This  technique  has  applications  in  several  areas  including  software  testing  using 
mutation  analysis  and  in  scheduling  computations  in  an  arbitrary  program  on  an  SIMD  machine. 

In  this  paper  we  provide  a  formulation  of  this  transformation  technique.  The  technique  itself  can  be  used 
to  transform  a  given  program  P,  desired  to  be  executed  on  N  data  sets,  to  another  program  VP.  Instead  of  execut¬ 
ing  P  sequentially  over  the  N  data  sets,  VP  executes  concurrently  over  all  the  N  data  sets.  The  transformation 
rules  are  such  that  even  though  P  may  not  yield  well  to  vectorization,  VP  will. 
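
The idea of running one program over all N data sets at once can be sketched in Python. This is only an illustration under stated assumptions: the scalar program `p`, the function `vp`, and the mask-based treatment of the branch are hypothetical and are not the transformation rules formulated in [Math87b].

```python
def p(x):
    """Scalar program P (hypothetical): contains a data-dependent branch."""
    if x < 0:
        y = -x
    else:
        y = x * 2
    return y


def vp(xs):
    """VP (hypothetical): each statement of P applied to all data sets at once.

    The branch is replaced by a mask. Both arms are evaluated for every
    data set and the mask selects which result is kept, so each step is a
    uniform element-wise operation of the kind a vector unit can execute.
    """
    mask = [x < 0 for x in xs]
    then_arm = [-x for x in xs]
    else_arm = [x * 2 for x in xs]
    return [t if m else e for m, t, e in zip(mask, then_arm, else_arm)]


data_sets = [-3, 0, 4]
sequential = [p(x) for x in data_sets]  # P executed once per data set
unified = vp(data_sets)                 # VP executed once over all data sets
```

Both `sequential` and `unified` hold the same results; only the execution order of the work differs.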

[Math88a] Abstract: In this paper we describe a new methodology to merge a large number of program mutants
into  a  small  set  of  highly  vectorizable  programs.  Mutants  are  generated  when  using  the  mutation  analysis  software 
testing  method.  Although  mutation  analysis  is  a  simple  and  effective  software  testing  methodology,  it  is  computa¬ 
tionally  intensive.  This  becomes  a  major  factor  to  be  reckoned  with  when  testing  large  programs. 

The  technique  described  in  this  paper  enables  a  tester  to  exploit  the  architecture  of  vector  processors,  like 
the  Cray  X-MP  or  the  Alliant  FX/8  for  efficient  execution  of  all  the  mutants. 

An analysis of the technique is presented elsewhere. The analysis aids a tester in estimating in advance the
speed  up  that  can  be  expected  if  program  unification  is  employed.  The  speed  up  compares  the  time  to  execute  a 
unified  set  of  mutants  with  the  time  to  execute  the  same  set  of  mutants  serially. 

[Maug85] Abstract: In this paper, we present the main concepts used in our symbolic debugger for Ada.
Described also is a companion tool, the Ada Program VIEWer, which gives users full access to program source
while  debugging.  This  debugger  is  one  of  the  components  of  the  Alsys  tool  set  which  aims  at  providing  high-level 
Ada-oriented  tools,  incorporating  state-of-the-art  techniques  for  software  design,  documentation,  and  develop¬ 
ment. 

[Mayf85]  Abstract:  The  Second  Workshop  identified  current  issues  in  Ada  Verification  and  focused  on  what  is 
needed  to  build  the  foundation  of  an  Ada  Verification  Technology.  IDA  Workshops  will  continue  to  be  a  meeting 
place for assessing the current state-of-the-art, identifying promising research areas, monitoring ongoing verification
work, promoting the use of the evolving technology, and ensuring that valuable outputs from one area are
fed  into  other  areas.  The  desired  product  of  these  workshops  will  be  recommendations  to  various  bodies  to  coor¬ 
dinate  and  sponsor  certain  R&D  activities.  Working  groups  on  special  topics  were  also  established. 

[Mayf86]  Abstract:  The  Third  IDA  Workshop  was  conducted  at  the  Research  Triangle  Institute,  Research  Tri¬ 
angle  Park,  North  Carolina  on  May  14-15, 1986.  The  theme  of  the  workshop  was  “Reaching  Verifiable  Ada  Sys¬ 
tems  by  1990”  and  addressed  the  following: 

1.  Advances  in  verification  technology 

2.  Adaptation  of  current  technology  in  Ada  verification  systems  and  methods 

3.  Broadening  the  base  of  support  for  work  in  Ada  verification 

4.  Encouraging  the  participation  by  larger  segments  of  both  the  Ada  and  the  verification  communities. 

A  detailed  exposition  of  the  Ada  formal  definition  being  developed  by  the  European  Economic  Commun¬ 
ity  was  presented.  This  exposition  took  the  form  of  a  series  of  tutorial  presentations  (enclosed  in  this  document) 
on  various  aspects  of  the  dynamic  and  static  semantics  of  the  definition  and  its  underlying  formalisms.  Dr.  Har¬ 
lan  Mills  from  IBM’s  Federal  Systems  Division  was  the  keynote  speaker. 

[McCa76]  Abstract:  This  paper  describes  a  graph-theoretic  complexity  measure  and  illustrates  how  it  can  be 
used  to  manage  and  control  program  complexity.  The  paper  first  explains  how  the  graph-theory  concepts  apply 
and  gives  an  intuitive  explanation  of  the  graph  concepts  in  programming  terms.  The  control  graphs  of  several 
actual Fortran programs are then presented to illustrate the correlation between intuitive complexity and the
graph-theoretic  complexity.  Several  properties  of  the  graph-theoretic  complexity  are  then  proved  which  show,  for 
example,  that  complexity  is  independent  of  physical  size  (adding  or  subtracting  functional  statements  leaves 
complexity unchanged) and that complexity depends only on the decision structure of a program.

The issue of using nonstructured control graphs is discussed and a method of measuring the “structuredness”
of  a  program  is  developed.  The  relationship  between  structure  and  reducibility  is  illustrated  with  several  exam¬ 
ples. 

The last section of this paper deals with a testing methodology used in conjunction with the complexity
measure;  a  testing  strategy  is  defined  that  dictates  that  a  program  can  either  admit  of  a  certain  minimal  testing 
level  or  the  program  can  be  structurally  reduced. 
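
The graph-theoretic measure described in [McCa76] is the cyclomatic number v(G) = e - n + 2p of a program's control graph, where e is the number of edges, n the number of nodes, and p the number of connected components. A minimal Python sketch follows; the adjacency-list encoding, the function name, and the example graph are assumptions made for illustration.

```python
def cyclomatic_complexity(graph, components=1):
    """Return v(G) = e - n + 2p for a control graph.

    graph maps each node name to the list of its successor nodes.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in graph.values())
    return edges - len(nodes) + 2 * components


# Hypothetical control graph: an if-then-else followed by a while loop.
example = {
    "entry": ["then", "else"],
    "then": ["loop_test"],
    "else": ["loop_test"],
    "loop_test": ["body", "exit"],
    "body": ["loop_test"],
    "exit": [],
}

# Same graph with one extra sequential statement on the "then" arm.
longer = dict(example)
longer["then"] = ["stmt"]
longer["stmt"] = ["loop_test"]
```

Both graphs have two decisions and the same complexity, which matches the property stated in the abstract: adding a functional statement adds one node and one edge, so the measure depends only on the decision structure.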

[McCa77a]  Abbreviated  Preface:  The  objective  of  the  study  was  to  establish  a  concept  of  software  quality  and 
provide  an  Air  Force  acquisition  manager  with  a  mechanism  to  quantitatively  specify  and  measure  the  desired 
level of quality in a software product. Software metrics provide the mechanism for the quantitative specification
and  measurement  of  quality. 

The  first  volume  describes  the  process  of  developing  our  concept  of  software  quality  and  what  the  underly¬ 
ing  software  attributes  are  that  provide  the  quality,  and  defines  the  metrics  which  provide  a  measure  of  the  degree 
to  which  the  attributes  exist. 

The  second  volume  describes  the  application  of  the  metrics  to  software  products  and  the  validation  of  the 
metrics’  relationship  to  software  quality. 

The  third  volume  is  a  preliminary  stand-alone  reference  document  to  be  used  by  an  acquisition  manager  to 
implement  the  techniques  established  during  the  study. 

[McCa79] Abbreviated Introduction: These high costs [for life cycle management of large-scale software systems]
result from characteristics of software that do not necessarily relate to the correctness of the implementation
of a function or how reliably the function operates, but instead relate to “how well” the software is designed,
coded, and documented with respect to maintaining, transferring, modifying, etc., the software. This “how well”
is  a  major  aspect  of  software  quality.  This  situation  identifies  a  weakness  in  how  the  requirements  of  software  sys¬ 
tem  developments  are  defined  currently.  Emphasis  is  placed  on  the  functions  that  must  be  performed,  the 
schedule  in  which  the  system  must  be  produced,  and  the  cost  of  producing  the  system.  Little  or  no  attention  is 
given  to  identifying  what  qualities  over  the  life  cycle  the  software  system  should  exemplify.  There  are  two  major 
reasons  for  this  focus.  First,  the  initial  operation  of  the  system,  how  correctly  and  reliably  the  system  performs, 
is  always  important  to  the  sponsor  of  a  development.  It  provides  the  first  test  of  not  only  how  well  the  developer 
has  done,  but  also  how  well  the  sponsor  has  done  in  specifying,  monitoring,  and  controlling  the  development. 
(Cost  and  schedule  are  obvious  concerns  since  the  system  usually  must  be  developed  in  a  constrained  period  of 
time  and  within  a  constrained  budget.)  Second,  no  standard  definition  or  identification  of  what  qualities  the 
acquisition  manager  should  consider  has  been  available.  No  mechanism  has  existed  which  would  allow  an 
acquisition manager to quantitatively specify the quality desired and then measure how well the development was
progressing  toward  the  desired  quality.  The  little  consideration  given  quality  to  date  generally  has  been  very  sub¬ 
jective  and  not  followed  up  by  measurement  or  assurance  activities. 

The potential life cycle cost savings of a standardized concept of software quality and a mechanism for
specifying  and  measuring  software  quality  are  substantial  considering  the  large  portion  of  life  cycle  costs  attri¬ 
buted  to  the  qualities  mentioned  previously.  The  subject  of  this  chapter  and  the  next  is  a  concept  of  software 
quality  metrics  and  their  application  in  a  quality  management  program. 

[McCa80a]  Abstract:  Software  metrics  (or  measurements)  which  predict  software  quality  have  been  refined  and 
enhanced.  Metrics  were  classified  as  anomaly-detecting  metrics  which  identify  deficiencies  in  documentation  or 
source  code,  predictive  metrics  which  measure  the  logic  of  the  design  and  implementation,  and  acceptance 
metrics  which  are  applied  to  the  end  product  to  assess  compliance  with  requirements. 

A  Software  Quality  Measurement  Manual  was  produced  which  contained  procedures  and  guidelines  for 
assisting  software  system  developers  in  setting  quality  goals,  applying  metrics  and  making  quality  assessments. 

[McCa80b]  Abstract:  Software  metrics  (or  measurements)  which  predict  software  quality  have  been  refined  and 
enhanced.  Metrics  were  classified  as  anomaly-detecting  metrics  which  identify  deficiencies  in  documentation  or 
source  code,  predictive  metrics  which  measure  the  logic  of  the  design  and  implementation,  and  acceptance 
metrics  which  are  applied  to  the  end  product  to  assess  compliance  with  requirements. 

A  Software  Quality  Measurement  Manual  was  produced  which  contained  procedures  and  guidelines  for 
assisting  software  system  developers  in  setting  quality  goals,  applying  metrics  and  making  quality  assessments. 

[McCa82a]  Abstract:  This  Guideline  presents  the  various  applications  of  the  Structured  Testing  methodology. 
The  core  of  this  technique  is  to  avoid  programs  that  are  inherently  untestable  by  first  measuring  and  limiting  pro¬ 
gram  complexity.  The  definition  and  development  of  a  program  complexity  measure  is  presented.  The  complexity 
measure  is  then,  in  the  second  phase  of  the  methodology,  used  to  quantify  and  proceduralize  the  testing  process. 
How  to  apply  the  techniques  to  the  maintenance  process  in  order  to  identify  the  code  that  must  be  re-tested  after 
making a modification is illustrated.

[McCa82b]  Abstract:  This  paper  deals  with  the  use  of  Structured  Analysis  just  prior  to  system  acceptance  test¬ 
ing.  Specifically,  the  drawing  of  Data  Flow  Diagrams  (DFDs)  was  done  after  integration  testing.  The  DFDs  pro¬ 
vided  a  picture  of  the  logical  flow  through  the  integrated  system  for  thorough  system  acceptance  testing.  System 
test  sets  were  derived  from  the  flows  in  the  DFDs.  System  test  repeatability  was  enhanced  by  the  matrix  which 
flowed  from  the  test  sets. 

[McCa87a]  Abstract:  This  report  describes  the  results  of  a  research  and  development  effort  to  develop  a  metho¬ 
dology  for  predicting  and  estimating  software  reliability.  A  Software  Reliability  Measurement  Framework  was 
established  which  spans  the  life  cycle  of  a  software  system  and  includes  the  specification,  prediction,  estimation, 
and  assessment  of  software  reliability.  Data  from  59  systems,  representing  over  5  million  lines  of  code,  were 
analyzed  and  generally  applicable  observations  about  software  reliability  were  made.  A  detailed  approach  to  the 
collection  and  analysis  of  reliability  data  is  also  presented. 

[McCa87b]  Abstract:  The  Guidebook  provides  detailed  procedures  for  the  preparation  of  software  reliability 
predictions  and  estimations  on  DOD  projects.  In  developing  the  Guidebook,  59  software  systems  were  examined 
and  19  key  variables  were  identified  that  affected  the  software  reliability  of  those  systems.  Procedures  to  measure 
these variables were developed to account for the type of application, development environment, various
software  characteristics  (such  as  modularity  and  complexity),  test  technique,  test  effort  and  test  coverage.  A 
methodology  was  also  provided  to  use  these  measures  to  predict  software  fault  density  and  software  failure  rates. 

The  Guidebook  could  be  applied  by  an  Air  Force  acquisition  office  to  help  plan  for  adequate  software 
reliability early in a project’s life, specify achievable software reliability goals in an RFP, evaluate progress toward
those  goals  at  key  project  milestones  and  decide  when  to  release  the  software.  The  Guidebook  could  also  be  used 
by  the  technical  staff  to  establish  thresholds  for  critical  measures  such  as  complexity.  In  addition,  the  Guidebook 
also  contains  Quality  Review  and  Standards  Review  Checklists  that  can  be  used  in  conjunction  with  the  software 
reliability  prediction  and  estimation  methodology.  The  Quality  Review  Checklists  are  used  to  assess  the  quality  of 
the  requirements  and  design  representation  of  the  software  while  the  Standards  Review  Checklist  would  be 
applied  to  software  code.  The  checklists  provide  good  guidance  for  ensuring  that  quality  is  built  into  the 
software. 

[McCl78a] Introduction: A frequently stated objective of structured programming is to control program complexity.
However, since the notion of complexity is not well understood and the existing techniques for measuring
complexity  are  crude,  it  is  difficult  to  determine  if  indeed  structured  programming  can  achieve  this  objective. 
The  purpose  of  this  paper  is  two-fold: 

1.  to  discuss  the  probable  sources  of  complexity  in  a  well-structured  program 

2.  to  present  a  methodology  for  measuring  and  controlling  complexity  in  a  well-structured  program. 

[McCl78b] Abbreviated Preface: This book is intended for the programmer in the business community. Its purpose
is to present software methodologies and techniques to guide the programmer in developing well-structured
programs.  In  general,  the  methodologies  discussed  are  applicable  to  any  higher  level  programming  language 
(e.g.,  ALGOL  60,  PL/1,  COBOL)  that  provides  the  basic  constructs  required  by  structured  programming. 
Explaining  how  to  code  well-structured  programs  is  only  an  ancillary  purpose  of  the  book.  The  primary  intent  is 
to  extend  the  structured  programming  approach  to  include  the  programming  process  as  well  as  the  program 
structure.  This  is  accomplished  by: 

1.  Clarifying  the  meaning  of  basic  software  terms  such  as  structured  programming,  top-down  programming,  and 
bottom-up  programming  (see  chapter  2) 

2.  Presenting  guidelines  for  selecting  and  applying  an  appropriate  design  methodology  for  writing  a  well-struc¬ 
tured  program  (see  chapters  3  and  4) 

3. Including program complexity analysis as an integral step in the programming process (see chapter 5).

A commonly stated objective of structured programming is to control program complexity, the issue being
that the more complex a program, the more difficult it becomes to understand how the program works. Obviously,
program complexity is a function of program size; but also it is a function of the number of possible program
execution paths and the control variables that direct path selection. The structured programming approach
attempts  to  control  program  complexity  by  restricting  module  invocation  and,  in  general,  by  restricting  the  use  of 
control  structures.  It  does  not,  however,  limit  the  use  of  control  variables  other  than  to  suggest  that  they  be 
accessed  in  as  few  program  modules  as  possible. 

In  this  book,  the  complexity  issue  is  confronted  by  providing  a  means  of  quantitatively  measuring  program 
complexity  and  by  inserting  complexity  control  as  a  postdesign/preimplementation  step  in  the  programming  pro¬ 
cess.  In  this  way,  the  programmer  can  more  effectively  analyze  and  control  the  quality  of  a  program  as  it  is  being 
developed. 

[McGa85b] Abstract: The availability and quality of computer resources during the software development process
has been speculated to have measurable, significant impact on the efficiency of the development process and
the  quality  of  the  resulting  product.  Environmental  components  such  as  the  types  of  tools,  machine  responsive¬ 
ness,  and  quantity  of  direct  access  storage  may  play  a  major  role  in  the  effort  to  produce  the  product  and  in  its 
subsequent  quality  as  measured  by  factors  such  as  reliability  and  ease  of  maintenance. 

During  the  past  six  years,  the  NASA  Goddard  Space  Flight  Center  has  conducted  experiments  with 
software  projects  in  an  attempt  to  better  understand  the  impact  of  software  development  methodologies, 
environments,  and  general  technologies  on  the  software  process  and  product.  Data  has  been  extracted  and 
examined from nearly 50 software development projects. The data collection and analysis has been performed
jointly  by  NASA,  the  Computer  Science  Department  at  the  University  of  Maryland,  and  the  Computer  Science 
Corp.  The  projects  have  varied  in  size  from  3000  up  to  130,000  lines  of  code  with  an  average  of  60,000  lines  of 
code.  All  have  been  related  to  the  support  of  satellite  flight  dynamics  ground-based  computations.  As  a  result  of 
changing  situations  and  technology,  the  computer  support  environment  has  varied  widely.  Some  projects 
enjoyed  fast  response  time,  archaic  tool  support,  and  limited  terminal  access  to  the  development  machine. 

This  study  examined  the  relationship  between  computer  resources  and  the  software  development  process 
and  product  as  exemplified  by  the  subject  NASA  data.  Based  upon  the  results,  a  number  of  computer  resource- 
related  implications  are  provided. 

[McMu80]  Abstract:  A  compiler-based  specification  and  testing  system  for  defining  data  types  has  been 
developed. The system, DAISTS (data abstraction implementation, specification, and testing system), includes
formal  algebraic  specifications  and  statement  and  expression  test  coverage  monitors.  This  paper  describes  our 
initial  attempt  to  evaluate  the  effectiveness  of  the  system  in  helping  users  produce  software.  In  an  exploratory 
study,  subjects  without  prior  experience  with  DAISTS  were  encouraged  by  the  system  to  develop  effective  sets  of 
test  cases  for  their  implementations.  Furthermore,  an  analysis  of  the  errors  remaining  in  the  implementations 
provided  valuable  hints  about  additional  useful  testing  metrics. 

[McMu83]  Abstract:  This  paper  describes  our  experience  in  specifying,  implementing,  and  validating  a  record- 
oriented  text  editor.  Algebraic  axioms  served  as  the  specification  notation;  and  the  implementation  was  tested 
with  a  compiler-based  system  that  uses  the  axioms  to  test  implementations  with  a  finite  collection  of  test  cases. 
Formal  specifications  were  sometimes  difficult  to  produce,  but  helped  reveal  errors  during  unit  testing.  Thorough 
exercising  of  the  implementations  by  the  specifications  resulted  in  few  errors  persisting  until  integration. 
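
The approach of using algebraic axioms as the oracle for a finite collection of test cases can be sketched in Python. This is not DAISTS notation; the stack type, the two axioms, and every function name below are hypothetical and chosen only to illustrate the idea.

```python
from itertools import product


# Implementation under test: an immutable stack represented as a tuple.
def new():
    return ()


def push(stack, item):
    return stack + (item,)


def pop(stack):
    return stack[:-1]


def top(stack):
    return stack[-1]


# Each axiom returns True when its left and right sides agree.
AXIOMS = {
    "pop(push(s, x)) = s": lambda s, x: pop(push(s, x)) == s,
    "top(push(s, x)) = x": lambda s, x: top(push(s, x)) == x,
}


def check_axioms(stacks, items):
    """Evaluate every axiom at every (stack, item) test point.

    Returns the list of (axiom name, stack, item) triples that failed.
    """
    failures = []
    for (name, axiom), s, x in product(AXIOMS.items(), stacks, items):
        if not axiom(s, x):
            failures.append((name, s, x))
    return failures


# A finite collection of test points supplied by the tester.
test_stacks = [new(), push(new(), 1), push(push(new(), 1), 2)]
test_items = [0, 7]
failures = check_axioms(test_stacks, test_items)
```

An empty `failures` list means the implementation agrees with the axioms at the chosen test points; it says nothing about points that were not exercised, which is why coverage monitoring matters in such a system.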

[Medi81]  Abstract:  This  paper  describes  an  incremental  programming  environment  (IPE)  based  on  compilation 
technology,  but  providing  facilities  traditionally  found  only  in  interpretive  systems.  IPE  provides  a  comfortable 
environment  for  a  single  programmer  working  on  a  single  program. 

In  IPE  the  programmer  has  a  uniform  view  of  the  program  in  terms  of  the  programming  language.  The 
program  is  manipulated  through  a  syntax-directed  editor  and  its  execution  is  controlled  by  a  debugging  facility, 
which  is  integrated  with  the  editor.  Other  tools  of  the  traditional  tools  cycle  (translator,  linker,  loader)  are 
applied  automatically  and  are  not  visible  to  the  programmer.  The  only  interface  to  the  programmer  is  the  user 
interface  of  the  editor. 


August  9, 1989 


[Mell82]  Abstract:  This  paper  describes  the  formal  specification  and  proof  methodology  employed  to  demon¬ 
strate  that  the  SIFT  computer  meets  its  requirements.  The  hierarchy  of  design  specifications  is  shown,  from  very 
abstract  descriptions  of  system  function  down  to  the  implementation.  The  most  abstract  design  specifications  are 
simple  and  easy  to  understand,  almost  all  details  of  the  realization  having  been  abstracted  out,  and  can  be  used  to 
ensure  that  the  system  functions  reliably  and  as  intended.  A  succession  of  lower  level  specifications  refine  these 
specifications  into  more  detailed  and  more  complex  views  of  the  system  design,  culminating  in  the  Pascal  imple¬ 
mentation.  The  paper  describes  the  rigorous  mechanical  proof  that  the  abstract  specifications  are  satisfied  by  the 
actual  implementation. 

[Meye67] Abbreviated Introduction: Anyone familiar with the theory of computability will be aware that practical
conclusions from the theory must be drawn with caution. If a problem can theoretically be solved by computation,
this does not mean that it is practical to do so. Conversely, if a problem is formally undecidable, this does
not  mean  that  the  subcases  of  primary  interest  are  impervious  to  solution  by  algorithmic  methods. 

The  question  of  detecting  improvable  programs  will  appear  again  later  in  this  paper,  but  our  main  concern 
will  be  with  a  related  question:  can  one  look  at  a  program  and  determine  an  upper  bound  on  its  running  time? 
Again,  a  fundamental  theorem  in  the  theory  of  computability  implies  that  this  cannot  be  done.  The  Theorem 
does  not  imply  that  one  cannot  bound  the  running  time  of  broad  categories  of  interesting  programs,  including 
programs  capable  of  computing  all  the  arithmetic  functions  one  is  likely  to  encounter  outside  the  theory  of  com¬ 
putability  itself. 

In  the  next  section  we  describe  such  a  class  of  programs,  called  “Loop  programs.”  Each  Loop  program 
consists  only  of  assignment  statements  and  iteration  (loop)  statements.  Although  Loop  programs  cannot  com¬ 
pute  all  the  computable  functions,  they  can  compute  all  the  primitive  recursive  functions. 
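
A Loop program in the sense of [Meye67] uses only assignments and loops whose iteration count is fixed on entry. The Python functions below are a rough transcription of that style, offered as an illustration only: the function names are hypothetical, and `for _ in range(n)` stands in for the bounded loop because `range(n)` is evaluated once before the loop starts.

```python
def add(x, y):
    """x + y using only increments inside one bounded loop."""
    for _ in range(y):      # repeat the body exactly y times
        x = x + 1
    return x


def multiply(x, y):
    """x * y using two nested bounded loops."""
    result = 0
    for _ in range(y):      # repeat y times
        for _ in range(x):  # repeat x times
            result = result + 1
    return result


def power(base, exponent):
    """base ** exponent using three levels of bounded loops."""
    result = 1
    for _ in range(exponent):
        result = multiply(result, base)
    return result
```

Because every loop count is known when the loop is entered, the number of steps can be bounded from the program text and the inputs alone, which is the property the abstract highlights.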

[Miar83]  Abstract:  The  consensus  in  the  programming  community  is  that  indentation  aids  program  comprehen¬ 
sion,  although  many  studies  do  not  back  this  up.  We  tested  program  comprehension  on  a  Pascal  program.  Two 
styles  of  indentation  were  used  -  blocked  and  nonblocked  -  in  addition  to  four  possible  levels  of  indentation 
(0, 2, 4, 6 spaces). Both experienced and novice subjects were used. Although the blocking style made no difference,
the level of indentation had a significant effect on program comprehension. (2-4 spaces had the highest
mean  score  for  program  comprehension.)  We  recommend  that  a  moderate  level  of  indentation  be  used  to 
increase  program  comprehension  and  user  satisfaction. 

[Mili84] Abstract: Many program verification methods are known nowadays: Inductive Assertion Method, Symbolic
Execution Method, Subgoal Induction Method, Computational Induction Method, Structural Induction
Method,  Fixpoint  Theory  of  Programs.  This  paper  presents  a  simple  classification  of  them. 

[Mill74a] Abstract: A structural basis for the formulation of testcases for given computer programs has been
found  to  be  an  effective  and  efficient  strategy.  An  existing  automated  program  validation  system  employs  these 
techniques  with  good  success  in  minimizing  the  number  of  testcases  required;  this  same  system  permits 
automatic  identification  of  testcases  in  a  high  proportion  of  instances.  Research  aimed  at  fully  automating  the 
testcase  generation  process  continues. 

[Mill75a]  Abstract:  Structured  programming  has  proved  to  be  an  important  methodology  for  systematic  program 
design  and  development.  Structured  programs  are  identified  as  compound  function  expressions  in  the  algebra  of 
functions.  The  algebraic  properties  of  these  function  expressions  permit  the  reformulation  (expansion  as  well  as 
reduction)  of  a  nested  subexpression  independently  of  its  environment,  thus  modeling  what  is  known  as  stepwise 
program  refinement  as  well  as  program  execution.  Finally,  structured  programming  is  characterized  in  terms  of 
the  selection  and  solution  of  certain  elementary  equations  defined  in  the  algebra  of  functions.  These  solutions 
can  be  given  in  general  formulas,  each  involving  a  single  parameter,  which  display  the  entire  freedom  available 

[Mill75b] Abstract: Computer software production costs continue to increase, to the point where these costs are
overwhelmingly  dominant  in  the  majority  of  computer  applications.  At  the  same  time,  there  is  an  increasing 
sense  of  urgency  about  software  reliability,  which  must  be  achieved  without  significant  additional  software  cost
increases.  Techniques  for  attaining  suitable  levels  of  implicit  software  quality  and  reliability  range  from  program¬ 
ming  and  managerial  disciplines  which  seek  to  instill  it  from  the  onset  of  a  software  system  development,  to  tech¬ 
niques  for  systematically  testing  (exercising)  software  systems.  Software  testing  is  a  viable  technique  for  com¬ 
pleted  (or  about-to-be-completed)  software  systems  because:  (1)  it  permits  full  “validation”  of  the  software  sys¬ 
tem,  (2)  it  can  approximate  a  formal  program  proof  of  correctness,  and  (3)  it  is  largely  automatable  and  relatively 
inexpensive. 

For  very  large  software  systems,  or  those  in  particularly  crucial  applications,  it  is  possible  to  reduce  the 
verification,  validation,  and  testing  cost  by  avoiding  certain  difficult-to-test  programming  constructs.  Some  of 
these  potentially  troublesome  forms  are  identified,  explanations  of  the  way  in  which  they  unnecessarily  increase 
software  testing  costs  are  given,  and  engineering  solutions  which  seek  to  avoid  these  difficulties  are  given. 

[Mill75c]  Abstract:  Software  quality  enhancement  can  be  achieved  in  the  near  term  through  use  of  a  systematic
program  testing  methodology.  The  methodology  attempts  to  relate  functional  software  testcases  with  formal 
software  specifications  as  a  means  to  achieve  correspondence  between  the  software  and  its  specification.  To  do 
this  requires  generation  of  appropriate  testcase  data. 

Automatic  testcase  generation  is  based  on  a  priori  knowledge  of  two  forms  of  internal  structure  informa¬ 
tion:  a  representation  of  the  tree  of  subschema  automatically  identified  from  within  each  program  text,  and  a 
representation  of  the  iteration  structure  of  each  subschema.  This  partition  of  a  large  program  allows  for  efficient 
and  effective  automatic  testcase  generation  using  straightforward  backtracking  techniques. 

During  backtracking  a  number  of  simplifying,  consolidating,  and  consistency  analyses  are  applied.  The 
result  is  either  (1)  early  recognition  of  the  impossibility  of  a  particular  program  flow,  or  (2)  efficient  generation  of 
input  variable  specifications  which  cause  the  testcase  to  traverse  each  portion  of  the  required  program  flow. 

A  number  of  machine  output  examples  of  the  backtracking  facility  are  given,  and  the  general  effectiveness
of  the  entire  process  is  discussed.

[Mill75d]  Abstract:  There  is  no  foolproof  way  to  ever  know  that  you  have  found  the  last  error  in  a  program.  So 
the  best  way  to  acquire  confidence  that  a  program  has  no  errors  is  never  to  find  the  first  one,  no  matter  how  much 
it  is  tested  and  used.  It  is  an  old  myth  that  programming  must  be  an  error-prone,  cut-and-try  process  of  frustra¬ 
tion  and  anxiety.  The  new  reality  is  that  you  can  learn  to  consistently  write  programs  which  are  error  free  in  their 
debugging  and  subsequent  use.  This  new  reality  is  founded  in  the  ideas  of  structured  programming  and  program 
correctness,  which  not  only  provide  a  systematic  approach  to  programming  but  also  motivate  a  high  degree  of 
concentration  and  precision  in  the  coding  subprocess. 

[Mill77a]  Abbreviated  Introduction:  The  problems  of  providing  quality  assurance  for  computer  software  have 
received  a  good  deal  of  attention  from  the  computing  community.  Such  areas  as  program  proving,  automatic  pro¬ 
gramming,  structured  programming,  and  hierarchical  design/development  methodologies  have  all  experienced 
significant  growth  -  largely  as  a  result  of  the  increased  attention  focused  on  them.  Program  testing,  on  the  other 
hand,  has  not  enjoyed  the  same  level  of  intensive  investigation,  even  though  it  has  a  number  of  technical  and 
intuitive  appeals. 

Both  art  and  theory  operate  in  program  testing  today.  The  “art”  of  program  testing  suggests  new  theoretical
routes  which  drive  the  development  of  additional  “theory”  which,  in  turn,  drives  the  accumulation  of  further
art. 

This  paper  describes  some  recent  efforts  to  build  a  bridge  linking  the  theory  of  program  testing  with  its 
practice.  Although  building  that  bridge  has  been  a  desirable  goal,  only  now  has  sufficient  research  insight  and 
actual  testing  experience  been  gained  to  even  begin  contemplating  the  form  this  practically  oriented  but  strongly 
founded  bridge  can  take. 

[Mill77b]  Abstract:  Automated  program  testing  tools  can  have  significant  utility  in  a  formal  program  testing  and 
quality  assurance  activity.  It  is  possible  to  characterize  automated  tools  by  the  degree  to  which  they  require 
modification  of  the  source  programs,  and  by  the  level  of  automation  they  achieve.  Ten  categories  of  automated
testing  tools  are  described  functionally  and  operationally.  Commercially  available  examples  of  each  class  of  tool 
are  given.  When  the  data  is  available,  indications  of  the  relative  effectiveness  of  the  tool  are  also  given. 

[Mill79a]  Abstract:  The  current  state  of  the  art  in  program  testing  technology  is  identified  in  terms  of  the  philo¬ 
sophical  underpinnings  of  software,  the  theoretical  foundations  of  the  field,  the  tools  and  techniques  that  can  be 
brought  to  bear  in  a  testing  activity,  the  methods  that  exist  for  planning  and  measuring  the  testing  activity,  and  the 
methods  of  management  and  control  that  exist. 

The  future  needs  for  program  testing  technology  are  identified  in  three  major  categories:  theoretical  foun¬ 
dations,  methodology,  and  automated  tools.  Over  twenty  needs  for  program  testing  targeted  for  the  1980s 
timeframe  are  identified  in  detail. 

[Mill79b]  Abbreviated  Introduction:  It  has  been  said  that  one  of  the  biggest  problems  in  the  software  quality
assurance  community  is  that  program  proving  techniques  appear  not  to  scale  up,  while  systematic  testing 
methods  don’t  appear  to  scale  down!  What’s  intended  here  is  to  observe  that  systematic  testing  methods  attempt 
to  deal  with  “large”  phenomena,  while  program  proving  techniques  deal  at  a  very  fine  level  of  detail.  Naturally 
enough,  the  outcomes  of  the  two  processes  will  be  different. 

The  objective  of  this  short  piece  is  to  present  some  statistics  derived  in  a  relatively  large-scale  systematic 
testing  activity  and  to  suggest  what  some  of  the  implications  of  those  “numbers”  might  be. 

[Mill79c]  Abstract:  The  workshop’s  general  and  session  chairmen  offer  their  summaries  of  the  challenges  to  the
software  testing  community  identified  at  the  December  meeting. 

[Mill81a]  Abbreviated  Preface:  This  is  the  second  edition  of  Software  Testing  and  Validation  Techniques.  This
edition  updates  and  amends  the  set  of  papers  included  in  the  prior  edition  that  was  first  organized  in  1978. 

Since  that  time,  there  have  been  several  advances  of  significance  in  the  software  testing  and  validation 
field;  a  number  of  new  papers  have  been  published,  and  in  many  ways  the  field  has  become  more  mature  and 
stable.  In  addition,  the  field  has  become  deeper  and  richer,  possibly  as  the  result  of  increased  emphasis  on 
software  quality  and  on  quality  assurance.  Each  year,  the  number  of  published  papers  of  significance  to  software 
testing  and  validation  has  increased,  as  has  the  number  of  researchers  actively  involved  in  the  field. 

The  papers  we  have  added  to  this  edition  fall  into  the  boundaries  we  have  previously  used  to  organize  the 
book:  Theoretical  Foundations,  Static  Analysis  -  Tools  and  Techniques,  Dynamic  Analysis  -  Tools  and  Tech¬ 
niques,  Effectiveness  Assessment,  Management  and  Planning,  and  Research  and  Development. 

[Mill81b]  Abstract:  Comparing  the  usefulness  of  methodologies  for  software  development  can  be  especially  difficult
when  the  services  offered  are  based  on  different  philosophies.  Two  systems,  AFFIRM  and  HDM,  were
compared  for  their  application  to  operating  system  security  analysis.  The  assessment  technique  was  to  specify
and  analyze  for  security  flaws  on  both  systems  a  miniature  example  of  a  security  kernel.  The  specification 
languages  are  at  the  opposite  poles  of  the  range  from  algebraic  axioms  to  transition  specifications.  The  types  of 
security  properties  that  could  be  verified  with  the  tools  available  were  access  policy  invariants  and  information 
flows.  One  theorem  prover  was  highly  interactive  and  the  other  nearly  automatic.  We  found  that  the  example 
could  be  specified  satisfactorily  and  recognizably  on  both  systems  with  a  comparable  amount  of  effort.  The  secu¬ 
rity  analyses,  on  the  other  hand,  led  to  very  different  verification  tasks  and  different  results.  The  two  results  were 
complementary  rather  than  contradictory,  and  some  additional  experimentation,  guided  by  theoretical  suspi¬ 
cions,  showed  the  exact  relationship  between  them. 

[Mill84]  Abstract:  Writing  distributed  programs  is  difficult  for  at  least  two  reasons.  The  first  reason  is  that  distributed
computing  environments  present  new  problems  caused  by  asynchrony,  independent  time  bases,  and  communication
delays.  The  second  reason  is  that  there  is  a  lack  of  tools  available  to  help  the  programmer  understand
the  program  he/she  has  written.  The  tools  we  use  for  single  machine  environments  do  not  easily  generalize  to  a 
distributed  environment.  There  has  been  only  limited  success  with  previous  systems  that  have  tried  to  help  the 
programmer  in  developing,  debugging,  and  measuring  distributed  programs.

To  better  understand  distributed  programs  we  have:  specified  a  model  for  distributed  computations, 
developed  a  measurement  methodology  from  this  model,  constructed  tools  to  implement  the  measurements,  and 
developed  data  analysis  techniques  to  obtain  useful  results  from  the  measurements.  The  most  important  feature 
of  the  models,  methodology,  and  tools  is  consistency  between  the  programmer’s  view,  the  computation  model,
the  measurement  methodology  and  the  analysis. 

This  consistency  has  resulted  in  several  benefits.  The  first  is  simplicity  of  structure  throughout  the  measurement
and  analysis  tools.  The  second  benefit  is  the  ease  of  obtaining  useful  information  about  a  program’s
behavior. 

The  model  of  distributed  programs  defines  the  two  basic  actions  of  a  program  to  be  computation  and  com¬ 
munication.  Our  research  focuses  on  the  communications  performed  by  a  program.  The  measurement  model  is 
based  on  the  monitoring  of  communications  between  the  parts  of  a  program.  Given  our  definition  of  a  program, 
monitoring  communications  completely  encapsulates  the  behavior  of  a  computation.  From  the  measurement 
model,  we  have  constructed  tools  to  measure  distributed  programs  for  two  working  operating  systems,  UNIX 
and  DEMOS/MP.  These  measurement  tools  provide  data  on  the  interactions  between  the  parts  of  a  distributed 
program. 

We  have  developed  a  number  of  analysis  techniques  to  provide  information  from  the  data  collected.  We 
can  report  communications  statistics  on  message  counts,  queue  lengths,  and  message  waiting  times.  We  can  per¬ 
form  more  complex  analyses  such  as  measuring  the  amount  of  parallelism  in  the  execution  of  a  distributed  pro¬ 
gram.  The  analyses  also  include  detecting  paths  of  causality  through  the  parts  of  a  distributed  program.  The 
measurement  tools  and  analyses  can  be  constructed  so  that  the  results  can  be  fed  back  into  the  operating  system 
to  help  with  scheduling  decisions. 

[Mill85]  Abstract:  A  new  method  for  estimating  the  present  failure  rate  of  a  program  is  presented.  A  crude  nonparametric
estimate  of  the  failure  rate  function  is  obtained  from  past  failure  times.  This  estimate  is  then
smoothed  by  fitting  a  completely  monotonic  function,  which  is  the  solution  of  a  quadratic  programming  problem. 
The  value  of  the  smoothed  function  at  present  time  is  used  as  the  estimate  of  present  failure  rate.  A  Monte  Carlo 
study  gives  an  indication  of  how  well  this  method  works.

[Mill87a]  Introduction:  Recent  experience  demonstrates  that  software  can  be  engineered  under  statistical  quality
control  and  that  certified  reliability  statistics  can  be  provided  with  delivered  software.  IBM’s  Cleanroom  process
has  uncovered  a  surprising  synergy  between  mathematical  verification  and  statistical  testing  of  software,  as
well  as  a  major  difference  between  mathematical  fallibility  and  debugging  fallibility  in  people. 

With  the  Cleanroom  process,  you  can  engineer  software  under  statistical  quality  control.  As  with  cleanroom
hardware  development,  the  process’s  first  priority  is  defect  prevention  rather  than  defect  removal  (of
course,  any  defects  not  prevented  should  be  removed).  This  first  priority  is  achieved  by  using  human  mathemati¬ 
cal  verification  in  place  of  program  debugging  to  prepare  software  for  system  test. 

Its  next  priority  is  to  provide  valid,  statistical  certification  of  the  software’s  quality  through  representative- 
user  testing  at  the  system  level.  The  measure  of  quality  is  the  mean  time  to  failure  in  appropriate  units  of  time 
(real  or  processor  time)  of  the  delivered  product.  The  certification  takes  into  account  the  growth  of  reliability 
achieved  during  system  testing  before  delivery. 

To  gain  the  benefits  of  quality  control  during  development,  Cleanroom  software  engineering  requires  a 
development  cycle  of  concurrent  fabrication  and  certification  of  product  increments  that  accumulate  into  the 
system  to  be  delivered.  This  lets  the  fabrication  process  be  altered  on  the  basis  of  early  certification  results  to 
achieve  the  quality  desired. 

[Mill87b]  Abstract:  The  Interrogator  is  a  Prolog  program  that  searches  for  security  vulnerabilities  in  network 
protocols  for  automatic  cryptographic  key  distribution.  Given  a  formal  specification  of  the  protocol,  it  looks  for 
message  modification  attacks  that  defeat  the  protocol  objective.  It  is  still  under  development,  but  it  has  been  able 
to  rediscover  a  known  vulnerability  in  a  published  protocol.  It  is  implemented  in  LM-Prolog  on  a  Lisp  Machine, 
with  a  graphical  user  interface. 

[Misr81]  Abstract:  We  present  a  proof  method  for  networks  of  processes  in  which  component  processes  communicate
exclusively  through  messages.  We  show  how  to  construct  proofs  of  invariant  properties  which  hold  at
all  times  during  network  computations,  and  terminal  properties  which  hold  upon  termination  of  network  compu¬ 
tations,  if  network  computation  terminates.  The  proof  method  is  based  upon  specifying  a  process  by  a  pair  of 
assertions,  analogous  to  pre-  and  post-conditions  in  sequential  program  proving.  The  correctness  of  network 
specification  is  proven  by  applying  inference  rules  to  the  specifications  of  component  processes.  Several  exam¬ 
ples  are  proved  using  this  technique. 

[Misr83]  Abstract:  Methods  proposed  for  software  reliability  prediction  are  reviewed.  A  case  study  is  then 
presented  of  the  analysis  of  failure  data  from  a  space  shuttle  software  project  to  predict  the  number  of  failures 
likely  during  a  mission,  and  the  subsequent  verification  of  the  predictions. 

[Miya87]  Abstract:  The  Deviation-value  (D-value)  is  a  new  measure  for  software  data  involved  during  software
development.  The  D-value  provides  an  alternative  to  software  metrics  based  upon  “per  number  of  lines  of  code” 
such  as  error  rate  (number  of  errors  per  thousand  lines  of  code)  and  documentation  rate  (number  of  pages  of 
module  design  documentation  per  thousand  lines  of  code).  Using  D-value,  the  data  of  software  modules  are 
much  more  fairly  evaluated  than  these  conventional  metrics. 

This  paper  presents  the  derivation  of  the  D-value  using  the  theoretical  background  of  a  control  chart 
called  u  chart  and  weighted  regression  analysis.  The  advantage  of  using  the  D-value  rather  than  metrics  based 
upon  “per  number  of  lines  of  code”  is  demonstrated  through  an  analysis  of  the  data  of  four  projects.  The 
D-value  is  used  to  find  the  data  items  which  actually  relate  to  software  quality,  and  we  find  that  the  quality  of  each 
module  measured  by  D-value  becomes  better  as  the  documentation  rate  D-value  increases.  Finally,  using  the 
theory  behind  the  D-value,  a  new  software  acceptance  guideline  is  discussed. 

[MiyaXX]  Abstract:  Effective  software  reliability  evaluation  requires  theories  of  software  reliability  which
define  and  deal  with  software  reliability  quantitatively,  technologies  for  reliability  data  measurement  and  data 
analysis,  techniques  to  estimate  or  predict  software  reliability,  and  practical  reliability  evaluation  methodologies 
which  effectively  reflect  the  characteristics  of  software.  This  paper  addresses  the  extents  to  which  these  require¬ 
ments  are  currently  met,  and  introduces  improved  approaches  for  an  effective  software  reliability  evaluation. 
Introduced  are  the  methodologies  for  software  reliability  evaluation  and  the  software  reliability  evaluation-aid 
tools. 

[Mizu83]  Overview:  In  Japan,  people  are  the  key  to  software  quality  control.  At  NEC,  members  of  a  QC  team 
work  together  to  achieve  high  standards,  competing  with  other  teams  for  awards. 

[Moha79]  Abstract:  Several  software  quality  assessment  methods  which  span  the  software  life  cycle  are  dis¬ 
cussed.  The  quality  of  a  system  design  can  be  estimated  by  measuring  the  system  entropy  function  or  the  system 
work  function.  The  quality  improvement  due  to  reconfiguration  can  be  determined  by  calculating  system  entropy 
loading  measures.  Software  science  and  Zipf’s  law  are  shown  to  be  useful  for  estimating  program  length  and
implementation  time.  Deterministic  and  statistical  methods  are  presented  for  predicting  the  number  of  errors. 
Testing  theory  is  useful  in  planning  the  program  test  process;  as  discussed  in  this  paper,  it  includes  measurement 
of  program  structural  characteristics  to  determine  test  effectiveness  and  test  planning.  Statistical  models  for 
estimating  software  reliability  are  also  discussed. 

[Mora75]  Abstract:  Estimates  of  future  performance  of  a  software  package  are  obtained  from  debugging  data  in 
essentially  two  ways.  In  one  way  the  record  in  time  of  the  occurrence  of  anomalies  is  used;  in  this  paper  three  different
mathematical  models  of  failure  rates  are  described,  together  with  illustrative  predictions  of  MTTF  and  of
the  total  error  content  using  actual  trouble  report  data.  A  second  estimate  of  performance  of  a  program  is  by  its 
“operational  reliability”  which  is  obtained  through  variations  of  input  data  according  to  assumed  probability 
laws.  With  respect  to  this  procedure,  an  outline  is  given  of  the  goals  of  some  research  currently  being  done  at 
McDonnell  Douglas  Astronautics. 

[Mora78a]  Introduction:  [The  author]  found  the  review  by  Dennis  Geller  of  the  Glenford  J.  Myers’  book,
Software  Reliability:  Principles  and  Practices  (Computer,  October  1977,  pp.  117-118),  provided  excellent  coverage 
of  the  principal  theme  of  the  book,  that  being  software,  vis-a-vis  reliability.  While  [the  author  concurs]  with  Mr. 
Geller  on  all  of  the  points  which  he  makes,  both  favorable  and  unfavorable,  [the  author’s]  own  reading  of  the
book  focussed  on  the  reliability  aspects  of  the  material  presented. 

From  this  perspective  there  are  several  comments  on  the  book  which  [the  author  offers]  for  consideration. 
[The  author  feels]  these  comments  are  especially  timely,  since  the  Myers’  interpretation  of  reliability,  presented 
in  the  first  book  on  the  subject,  reinforces  the  erroneous  concept  that  reliability  “equates  to”  perfection. 

[Mora78c]  Introduction  to  Comment  on  “A  Review  and  Evaluation  of  Software  Science,”  by  A.  Fitzsimmons 
and  T.  Love:  This  article  raised  two  questions  in  my  mind:  whether  the  “effort”  measure  is  a  good  measure  of 
complexity;  and  whether  the  correlation  coefficient  is  a  reliable  tool  for  validating  the  conjectures  of  software 
science. 

[Mora80]  Abstract:  Two  variations  of  the  Jelinski/Moranda  model  are  described.  The  first  permits  estimation 
of  the  error  content  of  the  completed  software  package  using  data  which  is  taken  on  only  portions  of  the  pack¬ 
age.  That  model  is  applicable  when  the  eventual  size  of  the  program  is  known  at  the  outset. 

The  second  model  permits  a  similar  analysis  during  the  development  of  any  software  package  which  is 
homogeneous  with  respect  to  its  complexity  (error  making/finding). 

These  models  should  assist  analysts  in  the  determination  of  error  content  early  on.  They  should  also  elim¬ 
inate  the  present  practice  of  applying  models  to  the  wrong  regime  (decreasing  failure  rate  models  applied  to 
growing-in-size  software). 
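
For orientation, the base Jelinski/Moranda model that these variations start from can be sketched in a few lines. This is a minimal illustration of the standard base model only, not of the two variations described in [Mora80]. It assumes the usual formulation in which the failure rate before the i-th failure is proportional to the number of errors still present; the names `jm_hazard` and `jm_expected_interval` and the parameter values are hypothetical.

```python
def jm_hazard(n_total: int, phi: float, i: int) -> float:
    """Failure rate while waiting for the i-th failure (i = 1, 2, ...).

    Base Jelinski/Moranda assumption: each of the n_total initial errors
    contributes the same rate phi, and each failure removes one error.
    """
    if not 1 <= i <= n_total:
        raise ValueError("i must lie between 1 and n_total")
    return phi * (n_total - i + 1)


def jm_expected_interval(n_total: int, phi: float, i: int) -> float:
    """Expected time between failure i-1 and failure i (exponential model)."""
    return 1.0 / jm_hazard(n_total, phi, i)


# Hypothetical parameters, chosen only for illustration.
n_total, phi = 10, 0.02
intervals = [jm_expected_interval(n_total, phi, i) for i in range(1, n_total + 1)]
for i, t in enumerate(intervals, start=1):
    print(f"failure {i:2d}: expected wait {t:6.2f}")
```

Because one error is removed after each failure, the expected intervals grow as testing proceeds. That decreasing failure rate is the regime the abstract warns should not be applied to software that is still growing in size.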

[More87]  A  theory  of  fault-based  program  testing  is  defined  and  explained.  Testing  is  fault-based  when  it  seeks
to  demonstrate  that  prescribed  faults  are  not  in  a  program.  It  is  assumed  that  a  program  can  only  be  incorrect  in 
a  limited  fashion  specified  by  associating  alternate  expressions  with  program  expressions.  Classes  of  alternate 
expressions  can  be  infinite.  Substitution  of  an  alternate  expression  for  a  program  expression  yields  an  alternate 
program  that  is  potentially  correct.  The  goal  of  fault-based  testing  is  to  produce  a  test  set  that  differentiates  the 
program  from  each  of  its  alternates.  A  particular  form  of  fault-based  testing  based  on  symbolic  execution  is 
presented.  In  symbolic  testing  program  expressions  are  replaced  by  symbolic  alternatives  that  represent  classes 
of  alternate  expressions.  The  output  from  the  system  is  an  expression  in  terms  of  the  input  and  the  symbolic 
alternative.  Equating  this  with  the  output  from  the  original  program  yields  a  propagation  equation  whose  solu¬ 
tions  determine  those  alternatives  which  are  not  differentiated  by  this  test. 
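
The propagation equation idea can be illustrated on a toy case. The sketch below is not the system described in [More87]. It assumes a hypothetical one-line program `2*x + 1` in which the constant `2` is replaced by a symbolic alternative `F`, so the alternate program's output is linear in `F` and the equation can be solved by hand. The function names are invented for this example.

```python
from fractions import Fraction


def original(x: int) -> int:
    """The program under test (hypothetical): 2*x + 1."""
    return 2 * x + 1


def alternate_output(x: int):
    """Output of the alternate program on input x, as coefficients (a, b)
    of a*F + b, where the symbolic alternative F replaces the constant 2."""
    return Fraction(x), Fraction(1)


def undifferentiated(x: int):
    """Solve the propagation equation a*F + b == original(x) for F.

    Returns the set of alternatives that test input x fails to distinguish
    from the original program, or the string "all" if every F satisfies it.
    """
    a, b = alternate_output(x)
    target = Fraction(original(x))
    if a == 0:
        return "all" if b == target else "none"
    return {(target - b) / a}


print(undifferentiated(3))  # only F = 2, i.e. the original constant, survives
print(undifferentiated(0))  # x = 0 masks the fault: no alternative is ruled out
```

With input 3 the only solution is the original constant, so that single test differentiates the program from infinitely many alternates. Input 0 differentiates none of them.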

[More88]  Abstract:  Testing  is  fault-based  when  its  goal  is  to  demonstrate  the  absence  of  prespecified  faults.  This 
paper  presents  a  framework  that  characterizes  fault-based  testing  schemes  based  on  how  many  prespecified 
faults  are  considered  and  on  the  contextual  information  used  to  deduce  the  absence  of  those  faults.  Established 
methods  of  fault-based  testing  are  placed  within  this  framework.  Most  methods  either  are  limited  to  finite  fault 
classes,  or  focus  on  local  effects  of  faults  rather  than  global  effects.  A  new  method  of  fault-based  testing  called 
symbolic  testing  is  presented  by  which  infinitely  many  prespecified  faults  can  be  proven  to  be  absent  from  a  pro¬ 
gram  based  upon  the  global  effect  the  faults  would  have  if  they  were  present.  Circumstances  are  discussed  as  to 
when  testing  with  a  finite  test  set  is  sufficient  to  prove  that  infinitely  many  prespecified  faults  are  not  present  in  a 
program. 

[Morg86]  Abstract:  This  paper  focuses  on  a  reachability  graph  analyzer  (RGA),  a  tool  which  provides  mechan¬ 
isms  for  proving  general  system  properties  (e.g.,  deadlock-freeness)  as  well  as  system-specific  properties.  The 
tool  is  sufficiently  general  to  allow  a  user  to  apply  complex  user-defined  analysis  algorithms  to  reachability 
graphs.  The  alternating-bit  protocol  with  a  bounded  channel  is  used  to  demonstrate  the  power  of  the  tool  and  to 
point  to  future  extensions. 

[Morg87]  The  introduction  of  concurrency  into  programs  has  added  to  the  complexity  of  the  software  design 
process.  This  is  most  evident  in  the  design  of  communications  protocols  where  concurrency  is  inherent  to  the
behavior  of  the  system.  The  complexity  exhibited  by  such  software  systems  makes  more  evident  the  need  for
computer-aided  tools  for  automatically  analyzing  behavior. 

The  Distributed  Systems  project  at  UCI  has  been  developing  techniques  and  tools,  based  on  Petri  nets, 
which  support  the  design  and  evaluation  of  concurrent  software  systems.  Techniques  based  on  constructing 
reachability  graphs  that  represent  projections  and  selections  of  complete  state-spaces  have  been  developed.  This 
paper  focuses  attention  on  the  computer-aided  analysis  of  these  graphs  for  the  purpose  of  proving  correctness  of 
the  modeled  system.  The  application  of  the  analysis  technique  to  evaluating  simulation  results  for  correctness  is 
discussed.  The  tool  which  supports  this  analysis  (the  reachability  graph  analyzer,  RGA)  is  also  described.  This 
tool  provides  mechanisms  for  proving  general  system  properties  (e.g.,  deadlock-freeness)  as  well  as  system- 
specific  properties.  The  tool  is  sufficiently  general  to  allow  a  user  to  apply  complex  user-defined  analysis  algo¬ 
rithms  to  reachability  graphs.  The  alternating-bit  protocol,  with  a  bounded  channel,  is  used  to  demonstrate  the 
power  of  the  tool  and  to  point  to  future  extensions. 

[Mori83]  Abstract:  This  paper  describes  techniques  for  the  representation  and  refinement  of  visual  specifications
in  the  context  of  PegaSys  (Programming  Environment  for  the  Graphical  Analysis  of  SYStems),  a  system
that  supports  a  visual  paradigm  for  the  development  and  explanation  of  interactions  among  the  conceptual  enti¬ 
ties  in  a  system  design.  Pictures  have  a  computational  meaning  that  is  represented  in  a  formal  language,  called 
the  form  calculus.  The  form  calculus  is  extensible  in  that  it  contains  a  core  set  of  primitives  which  can  be  used  to 
build  a  variety  of  abstract  design  models.  Complexity  is  managed  by  means  of  picture  hierarchies,  whose  con¬ 
struction  is  guided  by  a  precise  refinement  methodology. 

The  representation  and  refinement  techniques  presented  here  have  been  implemented  and  all  reasoning  is 
fully  automatic  and  efficient.  Determining  the  validity  of  a  picture  refinement,  for  example,  involves  either  the 
application  of  a  simple  graph  algorithm  or  the  proof  of  a  formula  whose  predicates  range  over  small,  finite  sets. 
Excerpts  from  a  sample  session  with  PegaSys  are  used  to  illustrate  a  hierarchy  of  visual  specifications. 

[Morr71]  Abstract:  An  inductive  method  for  proving  things  about  recursively  defined  functions  is  described.  It
has  been  shown  to  be  useful  for  proving  partial  functions  equivalent  and  thus  applicable  in  proofs  about  interpreters
for  programming  languages. 

[Morr77]  Abstract:  A  new  proof  method,  subgoal  induction,  is  presented  as  an  alternative  or  supplement  to  the 
commonly  used  inductive  assertion  method.  Its  major  virtue  is  that  it  can  often  be  used  to  prove  a  loop’s  correct¬ 
ness  directly  from  its  input-output  specification  without  the  use  of  an  invariant.  The  relation  between  subgoal 
induction  and  other  commonly  used  induction  rules  is  explored  and,  in  particular,  it  is  shown  that  subgoal  induc¬ 
tion  can  be  viewed  as  a  specialized  form  of  computation  induction.  Finally,  a  set  of  sufficient  conditions  are 
presented  which  guarantee  that  an  input-output  specification  is  strong  enough  for  the  induction  step  of  a  proof  by 
subgoal  induction  to  be  valid. 

[Muno88]  Abstract:  An  approach  to  software  product  testing  is  presented.  The  approach  uses  the  following 
techniques:  automatic  test  case  generation,  self-checking  test  cases,  black  box  test  cases,  random  test  cases, 
sampling  a  form  of  exhaustive  testing,  correctness  measurements,  and  the  correction  of  defects  in  the  test  cases 
instead  of  in  the  product  (defect  circumvention).  The  techniques  have  been  cost  effective  and  applied  to  very 
large  products. 

[Muns89]  Abstract:  Software  complexity  metrics  attempt  to  define  the  unique  characteristics  of  computer  pro¬ 
grams  in  an  analytical  way.  Many  such  metrics  have  been  developed  to  explain  various  perceived  differences 
among  programs.  Many  studies  have  been  conducted  to  show  the  similarity  among  classes  of  these  metrics.  What 
is  lacking  in  this  body  of  literature  is  a  technique  which  will  aid  in  the  establishment  of  the  true  dimensionality  of 
the  complexity  problem  space. 

The  objective  of  this  paper  is  to  examine  some  recent  investigations  in  the  area  of  software  complexity 
using  factor  analysis  to  begin  an  exploration  of  the  actual  dimensionality  of  the  complexity  metrics.  This 
technique  can  expose  the  relationships  of  these  many  metrics,  one  to  another.  Some  correlation  coefficients
from  recent  empirical  studies  on  software  metrics  were  factor  analyzed,  showing  the  probable  existence  of  five 
complexity  dimensions  within  thirty  different  complexity  measures. 

[Mura89]  Abstract:  This  paper  presents  a  method  for  detecting  deadlocks  in  Ada  tasking  programs  using  struc¬ 
tural  and  dynamic  analysis  of  Petri  nets.  Algorithmic  translation  of  the  Ada  programs  into  Petri  nets  that 
preserve  control  flow  and  message  flow  properties  is  described.  Properties  of  these  Petri  nets  are  discussed,  and 
algorithms  are  given  to  analyze  the  nets  to  obtain  information  about  static  deadlocks  that  can  occur  in  the  origi¬ 
nal  programs.  Petri  net  invariants  are  used  by  the  algorithms  to  reduce  the  time  and  space  complexities  associ¬ 
ated  with  dynamic  Petri  net  analysis  (i.e.,  reachability  graph  generation). 
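
A minimal sketch of the reachability-graph analysis mentioned above, under stated assumptions: the two-task net, its place and transition names, and the `is_final` test are all hypothetical, and no invariant-based reduction is attempted, so this is plain exhaustive exploration rather than the algorithms of the paper. A marking with no enabled transition that is not the intended final state is reported as a deadlock.

```python
from collections import deque

# Hypothetical net: two tasks need both resources but take them in opposite order.
# Each transition maps to (tokens consumed, tokens produced).
TRANSITIONS = {
    "t1_take_a": ({"idle1": 1, "res_a": 1}, {"t1_has_a": 1}),
    "t1_take_b": ({"t1_has_a": 1, "res_b": 1}, {"t1_done": 1, "res_a": 1, "res_b": 1}),
    "t2_take_b": ({"idle2": 1, "res_b": 1}, {"t2_has_b": 1}),
    "t2_take_a": ({"t2_has_b": 1, "res_a": 1}, {"t2_done": 1, "res_a": 1, "res_b": 1}),
}
INITIAL = {"idle1": 1, "idle2": 1, "res_a": 1, "res_b": 1}


def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in pre.items())


def fire(marking, pre, post):
    """Return the marking reached by firing an enabled transition."""
    new = dict(marking)
    for place, n in pre.items():
        new[place] -= n
    for place, n in post.items():
        new[place] = new.get(place, 0) + n
    return {place: n for place, n in new.items() if n > 0}


def is_final(marking):
    """Assumed notion of normal termination: both tasks have finished."""
    return marking.get("t1_done", 0) == 1 and marking.get("t2_done", 0) == 1


def find_deadlocks(initial, transitions, final_test):
    """Breadth-first generation of the reachability graph, collecting dead markings."""
    start = frozenset(initial.items())
    seen = {start}
    queue = deque([start])
    deadlocks = []
    while queue:
        marking = dict(queue.popleft())
        successors = [fire(marking, pre, post)
                      for pre, post in transitions.values()
                      if enabled(marking, pre)]
        if not successors and not final_test(marking):
            deadlocks.append(marking)
        for nxt in successors:
            key = frozenset(nxt.items())
            if key not in seen:
                seen.add(key)
                queue.append(key)
    return deadlocks


print(find_deadlocks(INITIAL, TRANSITIONS, is_final))
```

In this toy net the search reports the single marking in which each task holds one resource and waits for the other.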

[Musa75]  Abstract:  An  approach  to  a  theory  of  software  reliability  based  on  execution  time  is  derived.  This 
approach  provides  a  model  that  is  simple,  intuitively  appealing,  and  immediately  useful. 

The  theory  permits  the  estimation,  in  advance  of  a  project,  of  the  amount  of  testing  in  terms  of  execution 
time  required  to  achieve  a  specified  reliability  goal  [stated  as  a  mean  time  to  failure  (MTTF)].  Execution  time  can 
then  be  related  to  calendar  time,  permitting  a  schedule  to  be  developed.  Estimates  of  execution  time  and  calen¬ 
dar  time  remaining  until  the  reliability  goal  is  attained  can  be  continually  remade  as  testing  proceeds,  based  only 
on  the  length  of  the  execution  time  intervals  between  failures.  The  current  MTTF  and  the  number  of  errors 
remaining  can  also  be  estimated.  Maximum  likelihood  estimation  is  employed,  and  confidence  intervals  are  also
established.  The  foregoing  information  is  obviously  very  valuable  in  scheduling  and  monitoring  the  progress  of
program  testing.  A  program  has  been  implemented  to  compute  the  foregoing  quantities. 
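
The kind of estimate described above can be illustrated with a small sketch. The formulas are an assumption: they follow the form in which the basic execution-time model is commonly stated (MTTF growing exponentially with execution time), not text taken from this abstract. The parameter names `m0` (total expected failures), `t0` (initial MTTF), and `c` (testing compression factor) are illustrative, and the values used are made up; in practice they would be estimated from the observed failure intervals.

```python
import math


def mttf_after(tau, m0, t0, c=1.0):
    """MTTF after tau units of execution time (assumed exponential-growth form)."""
    return t0 * math.exp(c * tau / (m0 * t0))


def additional_execution_time(t_present, t_goal, m0, t0, c=1.0):
    """Execution time still needed to raise MTTF from t_present to t_goal."""
    return (m0 * t0 / c) * math.log(t_goal / t_present)


def additional_failures(t_present, t_goal, m0, t0):
    """Failures expected to be experienced before the MTTF goal is reached."""
    return m0 * t0 * (1.0 / t_present - 1.0 / t_goal)


# Made-up project: 100 expected failures, initial MTTF of 2 CPU hours, goal of 20.
remaining_time = additional_execution_time(2.0, 20.0, m0=100, t0=2.0)
remaining_failures = additional_failures(2.0, 20.0, m0=100, t0=2.0)
print(remaining_time, remaining_failures)
```

Relating the resulting execution time to calendar time, as the abstract describes, would need resource and staffing data that this sketch does not model.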

The  reliability  model  that  has  been  developed  can  be  used  in  making  system  tradeoffs  involving  software 
or  software  and  hardware  components.  It  also  provides  a  soundly  based  unit  of  measure  for  the  comparative 
evaluation  of  various  programming  techniques  that  are  expected  to  enhance  reliability. 

The  model  has  been  applied  to  four  medium-sized  software  development  projects,  all  of  which  have  com¬ 
pleted  their  life  cycles.  Measurements  taken  of  MTTF  during  operation  agree  well  with  the  predictions  made  at 
the  end  of  system  test.  As  far  as  the  author  can  determine,  these  are  the  first  times  that  a  software  reliability 
model  has  been  used  during  software  development  projects.  The  paper  reflects  and  incorporates  the  practical 
experience  gained. 

[Musa79a]  Abstract:  This  paper  investigates  the  validity  of  the  execution-time  theory  of  software  reliability.  The 
theory  is  outlined,  along  with  appropriate  background,  definitions,  assumptions,  and  mathematical  relation¬ 
ships.  Both  the  execution  time  and  calendar  time  components  are  described.  The  important  assumptions  are  dis¬ 
cussed.  Actual  data  are  used  to  test  the  validity  of  most  of  the  assumptions.  Model  and  actual  behavior  are  com¬ 
pared.  The  development  projects  and  operational  computation  center  software  from  which  the  data  have  been 
obtained  are  characterized  to  give  the  reader  some  basis  for  judging  the  breadth  of  applicability  of  the  concepts. 

[Musa79b]  Introduction:  Boehm,  Brown,  and  Lipow  have  characterized  the  multi-dimensional  nature  of
software  quality  in  terms  of  a  hierarchy  of  attributes.  One  of  the  high-level  attributes  is  reliability,  which  they 
define  qualitatively  as  the  satisfactory  performance  of  intended  functions.  This  definition  may  be  refined  to  the 
quantitative  statement  “probability  of  failure-free  operation  in  a  specified  environment  for  a  specified  time.”  A 
“failure”  is  an  unacceptable  departure  of  program  operation  from  program  requirements,  where,  as  in  the  case 
of  hardware,  “unacceptable”  must  ultimately  be  defined  by  the  user.  The  term  “fault”  will  be  used  to  indicate  the 
program  defect  that  causes  the  failure.  Several  trends  have  recently  combined  to  escalate  the  importance  of 
quantitative  software  reliability  measures: 

1.  The  large  and  growing  number  of  real-time  and  interactive  systems  has  increased  the  operational  and  cost 
impact  of  failure. 

2.  The  increasing  number,  size,  and  complexity  of  computer  networks  and  distributed  processing  systems  have 
multiplied  the  risk  and  effects  of  failure. 

3.  The  explosive  growth  of  personal  computing  has  created  a  demand  for  relatively  foolproof  software  for  unso¬ 
phisticated  users. 

Measurement  is  seen  to  be  important  as  soon  as  one  recognizes  that  in  software  as  in  hardware  there  can
be  too  much  as  well  as  too  little  reliability.  Improvement  of  reliability,  of  course,  costs  money,  and  usually 
impacts  development  schedules  and  system  performance  (in  the  case  of  software,  through  increased  memory, 
processing  time,  and  peripherals  requirements).  The  system  engineer  and  the  manager  have  to  make  design 
tradeoffs  among  the  foregoing  factors  and  it  is  best  that  this  be  done  in  quantitative  terms.  The  need  for  a  quanti¬ 
tative  reliability  measure  continues  throughout  the  development  process,  particularly  during  test,  since  reliability 
is  a  valuable  indicator  of  system  status.  Finally,  reliability  or  mean-time-to-failure  (MTTF)  is  a  useful  metric  for 
characterizing  system  operation  and  for  controlling  change  during  the  maintenance  phase.  This  paper  will  focus 
on  the  system  engineering  application,  but  it  will  also  touch  on  monitoring  the  system  test  phase  and  controlling 
change  during  maintenance. 

[Musa80b]  Abstract:  The  theme  of  this  paper  is  the  field  of  software  reliability  measurement  and  its  application.
Needs  for  and  potential  uses  of  software  reliability  measurement  are  discussed.  Software  reliability  and  hardware 
reliability  are  compared,  and  some  basic  software  reliability  concepts  are  outlined.  A  brief  summary  of  the  major 
steps  in  the  history  and  evolution  of  the  field  is  presented.  Two  of  the  leading  software  reliability  models  are 
described  in  some  detail.  The  topics  of  combinations  of  software  (and  hardware)  components  and  availability  are 
discussed  briefly.  The  paper  concludes  with  an  analysis  of  the  current  state-of-the-art  and  a  description  of  further 
research  needs. 

[Musa84]  Abstract:  A  new  software  reliability  model  is  developed  that  predicts  expected  failures  (and  hence 
related  reliability  quantities)  as  well  as  or  better  than  existing  software  reliability  models,  and  is  simpler  than  any  of
the  models  that  approach  it  in  predictive  validity.  The  model  incorporates  both  execution  time  and  calendar  time 
components,  each  of  which  is  derived.  The  model  is  evaluated  using  actual  data.

[Musa87]  Table  of  Contents:  Introduction  to  software  reliability,  selected  models,  applications.  Practical
Application,  system  definition,  parameter  determination,  project-specific  techniques,  application  procedures,
implementation  planning.  Theory,  software  reliability  modeling,  Markovian  models,  description  of  specific  models,
parameter  estimation,  comparison  of  software  reliability  models,  calendar  time  modeling,  failure  time
adjustment  for  evolving  programs.

[Musa89]  Abbreviated  Introduction:  How  do  you  validate  that  a  piece  of  software  loaded  into  a  processor  func¬ 
tions  correctly?  One  traditional  answer  is  that  you  subject  it  to  a  rigorous  system  test.  But  there  is  a  fundamental 
problem:  For  any  but  the  most  trivial  application,  the  number  of  distinct  input  combinations  you  would  need  to 
verify  is  enormous  -  orders  and  orders  of  magnitude  larger  than  any  number  that  can  be  tested  exhaustively. 

Furthermore,  because  of  the  discrete  nature  of  computer  memory  and  processing,  the  difference  of  a  sin¬ 
gle  input  bit  out  of  thousands  may  be  all  that  separates  an  input  combination  that  runs  successfully  from  one  that 
doesn’t. 

How  then  do  you  validate  software?  In  hard  engineering  terms,  the  answer  is  that  up  to  now  you  really 
haven’t.  There  is  a  lot  of  lore  about  system  testing,  but  it  all  boils  down  to  guesswork.  That  is,  it  is  guesswork 
unless  you  can  structure  the  problem  and  perform  the  testing  so  that  you  can  apply  mathematical  statistics. 

If  you  can  do  this,  you  can  say  something  like  “No,  we  cannot  be  absolutely  certain  that  the  software  will 
never  fail,  but  relative  to  a  theoretically  sound  and  experimentally  validated  statistical  model,  we  have  done  suffi¬ 
cient  testing  to  say  with  95-percent  confidence  that  the  probability  of  1,000  CPU  hours  of  failure-free  operation  in 
a  probabilistically  defined  environment  is  at  least  0.995.” 

When  you  do  this,  you  are  applying  software-reliability  measurement.  In  this  situation,  this  is  the  best  you
can  do.  For  purists,  this  may  not  be  a  satisfactory  answer  to  our  initial  question.  But  with  software-reliability 
measurement,  you  do  not  deal  explicitly  with  the  vastness,  discreteness,  and  discontinuity  of  a  program  input 
space  -  you  sidestep  these  imponderables  by  using  statistics  to  provide  concrete,  quantitative  guidance. 

In  this  article,  we  define  the  basic  concepts  of  software-reliability  measurement  and  show  you  how  to  use 
them  in  software  validation. 
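
The quoted confidence statement in this introduction can be given rough numbers with a short worked example. This is a sketch under simplifying assumptions that are not taken from the article: a constant failure intensity (exponentially distributed failure times) and a test campaign in which no failures are observed. The function names are illustrative.

```python
import math


def max_failure_intensity(reliability, mission_hours):
    """Largest constant failure intensity (per CPU hour) compatible with the goal.

    Under the exponential assumption R(t) = exp(-lambda * t).
    """
    return -math.log(reliability) / mission_hours


def failure_free_test_hours(reliability, mission_hours, confidence):
    """Failure-free test time needed to bound the intensity at the given confidence.

    With zero observed failures in T hours, the one-sided upper bound on the
    intensity is -ln(1 - confidence) / T.
    """
    lam = max_failure_intensity(reliability, mission_hours)
    return -math.log(1.0 - confidence) / lam


lam = max_failure_intensity(0.995, 1000.0)
hours = failure_free_test_hours(0.995, 1000.0, 0.95)
print(f"intensity bound: {lam:.3e} per CPU hour, failure-free test time: {hours:.0f} CPU hours")
```

Under these assumptions the required failure-free test time comes out at several hundred thousand CPU hours, which shows how demanding such a claim is when no reliability growth model is used.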

[Muss79]  Abstract:  This  paper  describes  the  data  type  definition  facilities  of  the  AFFIRM  system  for  program
specification  and  verification.  Following  an  overview  of  the  system,  we  review  the  rewrite  rule  concepts  that  form 
the  theoretical  basis  for  its  data  type  facilities.  The  main  emphasis  is  on  methods  of  ensuring  convergence  (finite 
and  unique  termination)  of  sets  of  rewrite  rules  and  on  the  relation  of  this  property  to  the  equational  and  induc¬ 
tive  proof  theories  of  data  types. 

[Myer77]  Abstract:  A  recent  paper  has  described  a  graph-theoretic  measure  of  program  complexity,  where  a 
program’s  complexity  is  assumed  to  be  only  a  factor  of  the  program’s  decision  structure.  However,  several 
anomalies  have  been  found  where  a  higher  complexity  measure  would  be  calculated  for  a  program  of  lesser  com¬ 
plexity  than  for  a  more-complex  program.  This  paper  discusses  these  anomalies,  describes  a  simple  extension  to 
the  measure  to  eliminate  them,  and  applies  the  measure  to  several  programs  in  the  literature. 
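
For illustration, the graph-theoretic measure can be computed from a control-flow graph as edges minus nodes plus two per connected component. The interval form of the extension shown here (lower bound from the number of decision statements, upper bound from the number of individual conditions) is how the extension is commonly summarized and should be read as an assumption, not as a quotation from the paper. The function names and the example graph are hypothetical.

```python
def cyclomatic_complexity(edges, num_components=1):
    """Graph-theoretic complexity: E - N + 2P for a control-flow graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


def complexity_interval(conditions_per_decision):
    """Assumed interval form: (decisions + 1, individual conditions + 1).

    conditions_per_decision lists, for each decision statement, how many
    simple conditions its predicate combines.
    """
    lower = len(conditions_per_decision) + 1
    upper = sum(conditions_per_decision) + 1
    return lower, upper


# Control-flow graph of a single IF ... ELSE statement.
if_else = [("entry", "then"), ("entry", "else"), ("then", "exit"), ("else", "exit")]
print(cyclomatic_complexity(if_else))

# One decision with a compound predicate (A and B) versus two nested simple decisions.
print(complexity_interval([2]), complexity_interval([1, 1]))
```

The second call shows the kind of distinction the interval is meant to capture: a single compound decision and two nested simple decisions share an upper bound but differ in the lower bound.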

[Myer78a]  Abstract:  This  paper  describes  an  experiment  in  program  testing,  employing  59  highly  experienced 
data  processing  professionals  using  seven  methods  to  test  a  small  PL/1  program.  The  results  show  that  the  popu¬ 
lar  code  walkthrough/inspection  method  was  as  effective  as  other  computer-based  methods  in  finding  errors  and 
that  the  most  effective  methods  (in  terms  of  errors  found  and  cost)  employed  pairs  of  subjects  who  tested  the 
program  independently  and  then  pooled  their  findings.  The  study  also  shows  that  there  is  a  tremendous  amount 
of  variability  among  subjects  and  that  the  ability  to  detect  certain  types  of  errors  varies  from  method  to  method. 

[Myer78b]  Introduction:  Moranda’s  remarks  in  The  Open  Channel  in  Computer,  April  1978,  on  my  book, 
Software  Reliability:  Principles  and  Practices,  fall  into  two  general  areas.  First,  he  feels  that  the  book  “is  not
about  software  reliability  as  it  has  come  to  be  defined.”  Second,  he  seems  defensive  about  my  “low  opinion”  (his 
words)  of  probability-based  models,  particularly  his  model. 

[Myer79]  Table  of  Contents:  A  Self-Assessment  Test.  The  Psychology  and  Economics  of  Program  Testing.
Program  Inspections,  Walkthroughs,  and  Reviews.  Test  Case  Design.  Module  Testing.  Higher-Order  Testing.
Debugging.  Test  Tools  and  Other  Techniques. 

[Myer83]  Abstract:  Many  modern  computer  languages  allow  the  programmer  to  define  and  use  a  variety  of  data
types.  Few  programming  systems,  however,  allow  the  programmer  similar  flexibility  when  displaying  the  data 
structures  for  debugging,  monitoring  and  documenting  programs.  Incense  is  a  working  prototype  system  that 
allows  the  programmer  to  interactively  investigate  data  structures  in  actual  programs.  The  desired  displays  can 
be  specified  by  the  programmer  or  a  default  can  be  used.  The  default  displays  provided  by  Incense  present  the 
standard  form  for  literals  of  the  basic  types,  the  actual  names  for  scalar  types,  stacked  boxes  for  records  and 
arrays,  and  curved  lines  with  arrowheads  for  pointers.  In  addition  to  displaying  data  structures,  Incense  also
allows  the  user  to  select,  move,  erase  and  redimension  the  resulting  displays.  These  interactions  are  provided  in 
a  uniform,  natural  manner  using  a  pointer  device  (mouse)  and  keyboard.

[Myhr68]  Abstract:  Some  specific  comparisons  are  made  in  this  note  between  the  use  of  the  asymptotic  Chi- 
square  distribution  of  the  likelihood  ratio  and  the  asymptotic  normality  of  the  maximum  likelihood  estimates  to 
obtain  confidence  intervals  for  reliabilities  of  arbitrary  systems  when  only  failure  data  on  the  components  is 
known.  In  all  the  comparisons  made,  using  moderate  samples  and  systems  of  average  complexity,  the  asymptotic 
Chi-square  appears  to  give  much  more  accurate  confidence  intervals.  Although  the  asymptotic  Chi-square 
method  requires  more  computation  for  most  systems  than  does  the  method  based  on  asymptotic  normality,  these 
examples  indicate  the  Chi-square  method  would  yield  superior  results  in  most  practical  instances. 

[NBS82a]  Abstract:  Thirty  techniques  and  tools  for  validation,  verification,  and  testing  (V,V&T)  are  described. 
Each  description  includes  the  basic  features  of  the  technique  or  tool,  the  input,  the  output,  an  example,  an 
assessment  of  the  effectiveness  and  usability,  applicability,  an  estimate  of  the  learning  time  and  training,  an  esti¬ 
mate  of  needed  resources,  and  references. 

[NBS82b]  Abstract:  Thirty  techniques  and  tools  for  validation,  verification,  and  testing  (V,V&T)  are  described.
Each  description  includes  the  basic  features  of  the  technique  or  tool,  the  input,  the  output,  an  example,  an 
assessment  of  the  effectiveness  and  usability,  applicability,  an  estimate  of  the  learning  time  and  training,  an  esti¬ 
mate  of  needed  resources,  and  references. 

[Nage84]  Abbreviated  Summary:  This  report  documents  the  second  of  two  studies  performed  by  Boeing  Com¬ 
puter  Services  on  modeling  the  process  of  software  error  detection  from  the  results  of  experiments  specifically 
designed  to  complement  this  activity.  The  experiments  consist  of  simulations  conducted  on  code  prepared  under 
controlled  conditions  and  executed  with  randomly  selected  inputs.  Six  codes  were  developed  in  the  first  study 
and  this  study  continues  the  experiment  with  six  more.  The  code  is  initialized  to  an  original  state  and  flexed  with 
independently  generated  random  inputs.  Errors  are  corrected  as  they  are  encountered  until  a  stopping  rule  is 
satisfied.  Replication  is  introduced  by  repeating  the  entire  process  from  initialization.

The  previous  study  explored  the  effects  of  programmer  and  problem  as  experimental  design  factors  on  the
[illegible  in  source].  The  current  study  enlarges  this  set  of  factors  by  varying  the  experience  level  of  the  programmer
and  the  relative  frequency  or  usage  of  the  program  units.  The  use  of  FORTRAN  is  contrasted  with  the  use  of  a 
micro-based  assembler  language  as  another  design  factor.  All  of  these  factors,  not  surprisingly,  affected  perfor¬ 
mance  and  some  very  tentative  relational  hypotheses  are  suggested. 

An  analytic  framework  for  replicated  and  non-replicated  (i.e.,  traditional)  software  experiments  is  ini¬ 
tiated  in  this  study  in  order  to  present  the  results  in  a  meaningful  context.  A  method  of  obtaining  an  upper  bound 
on  the  error  rate  of  the  next  error  is  proposed.  Two  other  forecasting  methods  are  proposed:  one  is  based  on  a
crude  approximation  to  the  proportional  hazards  model;  the  other  subtracts  the  observed  error  probability
and  the  program’s  success  rate  from  one  to  estimate  the  remaining  error  rate.

[Naka89]  Abstract:  Many  simple  software  errors  are  found  in  earlier  software  test  phases.  The  ratio  of  complex 
errors  to  simple  errors  gradually  increases  with  continual  testing.  This  paper  describes  a  software  reliability 
model  called  the  Error  Complexity  Model.  In  this  model,  errors  are  classified  by  error  complexity  which  is  a 
measure  of  error  detectability.  The  number  of  remaining  software  errors  is  estimated  from  the  ratio  of  complex 
to  simple  errors  and  the  number  of  discovered  errors.  New  criteria  for  error  complexity  classification  are  pro¬ 
posed.  The  model  is  evaluated  and  compared  with  existing  models  using  actual  error  data. 

[Naur69]  Abstract:  The  paper  describes  a  programming  discipline,  aiming  at  the  systematic  construction  of  pro¬ 
grams  from  given  global  requirements.  The  crucial  step  in  the  approach  is  the  conversion  of  the  global  require¬ 
ments  into  sets  of  action  clusters  (sequences  of  program  statements),  which  are  then  used  as  building  blocks  for 
the  final  program.  The  relation  of  the  approach  to  proof  techniques  and  to  programming  languages  is  discussed 
briefly. 

[Nels78]  Abstract:  Recent  work  on  software  reliability  associates  correct  execution  of  a  test  case  with  a  statisti¬ 
cal  inference  that  the  program  will  execute  correctly  for  a  specified  subset  of  inputs.  Test  cases  can  be  designed 
so  that  their  associated  subsets  cover  the  entire  input  domain,  allowing  reliability  estimates  to  be  made  for 
expected  operational  use  profiles. 
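
A minimal sketch of the estimate described above, assuming (as a simplification) that one passing test case vouches for its whole input subset and one failing test case condemns it. The subset names and profile probabilities are hypothetical.

```python
def estimated_reliability(profile, passed):
    """Sum the operational-profile probability of every subset whose test passed.

    profile maps each input subset to its probability of being exercised in use;
    passed maps each subset to the outcome of its representative test case.
    """
    total = sum(profile.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("profile probabilities must sum to 1")
    return sum(p for subset, p in profile.items() if passed[subset])


# Hypothetical partition of the input domain into three subsets.
profile = {"routine_query": 0.7, "bulk_update": 0.2, "error_recovery": 0.1}
passed = {"routine_query": True, "bulk_update": True, "error_recovery": False}
print(estimated_reliability(profile, passed))
```

Changing the profile while keeping the same test outcomes changes the estimate, which is the sense in which the estimate is tied to an expected operational use.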

[Ng78]  Summary:  This  paper  reports  a  FORTRAN  post  mortem  dump  system  (PMD)  for  the  ICL  1900  comput¬ 
ers.  The  system,  jointly  implemented  by  Birmingham  and  Liverpool  Universities,  can  perform  a  core/storage 
dump  in  terms  of  the  original  FORTRAN  source  following  the  segment  (subroutines,  etc.)  history  of  execution 
when  the  program  fails  to  terminate  successfully.  The  compilation  overheads  of  the  new  system  are  very  low  and 
the  execution  overheads  practically  none. 

[Nico87]  Abstract:  In  this  paper  we  consider  the  queueing  analysis  of  a  fault-tolerant  computer  system.  The 
failure/repair  behavior  of  the  server  is  modeled  by  an  irreducible  continuous-time  Markov  chain.  Jobs  arrive  in  a 
Poisson  fashion  to  the  system  and  are  serviced  according  to  FCFS  discipline.  A  failure  may  cause  the  loss  of  the 
work  already  done  on  the  job  in  service,  if  any;  in  this  case  the  interrupted  job  is  repeated  as  soon  as  the  server  is 
ready  to  deliver  service.  In  addition  to  the  delays  due  to  failures  and  repairs,  jobs  suffer  delays  due  to  queuing.
We  present  an  exact  queueing  analysis  of  the  system  and  study  the  steady  state  behavior  of  the  number  of  jobs  in 
the  system.  As  a  numerical  example,  we  consider  a  system  with  two  processors  subject  to  failures  and  repairs. 

[Nood75]  Abstract:  In  the  author’s  view  structured  programming  consists  of  the  use  of  the  following:  structure, 
abstraction,  and  specification.  The  purpose  of  this  paper  is  to  develop  formal  specifications  for  a  nontrivial  pro¬ 
gram  in  order  to  facilitate  a  proof  of  correctness.  It  is  shown  how  the  specifications  serve  as  an  abstraction  for 
the  program.  A  proof  of  correctness  then  consists  of  merely  showing  that  the  program  at  each  level  meets  its  for¬ 
mal  specifications.  Under  this  methodology  lower  levels  of  the  program  can  be  changed  without  affecting  higher 
levels. 

[Ntaf79]  Abstract:  In  this  paper  various  path  cover  problems,  arising  in  program  testing,  are  discussed.
Dilworth’s  theorem  for  acyclic  digraphs  is  generalized.  Two  methods  for  finding  a  minimum  set  of  paths  (minimum
path  cover)  that  covers  the  vertices  (or  the  edges)  of  a  digraph  are  given.  To  model  interactions  among  code  seg¬ 
ments,  the  notions  of  required  pairs  and  required  paths  are  introduced.  It  is  shown  that  finding  a  minimum  path 
cover  for  a  set  of  required  pairs  is  NP-hard.  An  efficient  algorithm  is  given  for  finding  a  minimum  path  cover  for 
a  set  of  required  paths.  Other  constrained  path  problems  are  considered  and  their  complexities  are  discussed. 
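
As an illustration of the unconstrained vertex-cover case, the sketch below uses the classical construction associated with Dilworth's theorem: take the transitive closure of the acyclic digraph, find a maximum bipartite matching in it, and subtract the matching size from the number of vertices. This is a textbook technique offered as an assumption about what such a method can look like, not the algorithms given in the paper; paths are allowed to share vertices, and the function names are illustrative.

```python
def reachable_sets(vertices, edges):
    """Map each vertex of an acyclic digraph to the set of vertices reachable from it."""
    successors = {v: [] for v in vertices}
    for u, v in edges:
        successors[u].append(v)
    reach = {}

    def visit(v):
        if v not in reach:
            found = set()
            for w in successors[v]:
                found.add(w)
                found |= visit(w)
            reach[v] = found
        return reach[v]

    for v in vertices:
        visit(v)
    return reach


def min_path_cover_size(vertices, edges):
    """Minimum number of paths (possibly sharing vertices) covering every vertex."""
    reach = reachable_sets(vertices, edges)
    matched_to = {}  # right-hand vertex -> left-hand vertex currently matched to it

    def try_augment(u, visited):
        # Standard augmenting-path step of bipartite matching.
        for v in reach[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in matched_to or try_augment(matched_to[v], visited):
                matched_to[v] = u
                return True
        return False

    matching = sum(1 for u in vertices if try_augment(u, set()))
    return len(vertices) - matching


# Two entry segments a and b merge at c, which then branches to d and e.
print(min_path_cover_size(list("abcde"),
                          [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e")]))
```

In the example, the two paths a-c-d and b-c-e cover all five vertices while sharing c; requiring vertex-disjoint paths would need three.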

[Ntaf81a]  Abstract:  In  this  paper,  we  introduce  required  element  testing  and  report  on  an  experimental
comparison  of  this  strategy  with  branch  and  random  testing.  The  required  element  testing  strategy  studied  here  uses
data  flow  information  to  generate  a  set  of  required  elements  for  a  program.  The  comparison  with  branch  and  ran¬ 
dom  testing  is  performed  using  mutation  analysis  as  a  measure  of  test  set  adequacy. 

[Ntaf81b]  Abstract:  Certain  graph  theoretic  problems  dealing  with  the  testing  of  structured  programs  are
treated.  A  structured  digraph  is  a  digraph  that  represents  a  structured  program.  A  labeling  procedure  which 
characterizes  structured  digraphs  is  described.  An  efficient  algorithm  for  finding  a  minimum  path  cover  for  the 
vertices  of  digraphs  that  belong  to  an  important  family  of  structured  digraphs  is  given.  To  model  interactions 
among  code  segments  the  notions  of  “required  pairs”  and  “must  pairs”  are  introduced  and  the  corresponding 
constrained  path  cover  problems  are  shown  to  be  NP-complete  even  for  acyclic  structured  digraphs. 

[Ntaf82]  Abstract:  Two  classes  of  program  testing  strategies  are  introduced  that  consist  of  specifying  a  set  of 
required  elements  for  the  program  and  then  covering  those  elements  with  appropriate  test  inputs.  In  general,  a 
required  element  has  a  structural  and  a  functional  component  and  is  covered  by  a  test  case  if  the  test  case  causes 
the  features  specified  in  the  structural  component  to  be  executed  under  the  conditions  specified  in  the  functional 
component.  Data  flow  analysis  is  used  to  specify  the  structural  component,  and  data  flow  interactions  are  used  as 
a  basis  for  developing  the  functional  component.  The  strategies  are  illustrated  with  examples  and  some  experi¬ 
mental  evaluations  of  their  effectiveness  are  presented. 

[Ntaf88]  Abstract:  In  this  paper  we  compare  a  number  of  structural  testing  strategies  in  terms  of  their  relative
coverage  of  the  program’s  structure  and  also  in  terms  of  the  number  of  test  cases  needed  to  satisfy  each  strategy. 
We  also  discuss  some  of  the  deficiencies  of  such  comparisons. 

[Offu87]  Abstract:  Mutation  analysis  is  a  powerful  technique  for  testing  software  systems.  In  the  Mothra
project,  conducted  at  Georgia  Tech’s  Software  Engineering  Research  Center,  mutation  analysis  is  used  as  a  basis  for
building  an  integrated  software  testing  environment.  Mutation  analysis  requires  the  execution  of  many  slightly 
differing  versions  of  the  same  program  to  evaluate  the  quality  of  the  data  used  to  test  the  program.  In  the  current 
version  of  the  Mothra  system,  a  program  to  be  tested  is  translated  to  intermediate  code,  where  it  and  its  mutated 
versions  are  executed  by  an  interpreter. 
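
The idea of running many slightly differing versions of a program against the same test data can be sketched as follows. This toy does not translate anything to intermediate code or use an interpreter as Mothra does; the program under test, the set of mutant operators, and the function names are hypothetical, and mutants are produced simply by swapping one relational operator.

```python
import operator


def program(a, b, compare=operator.gt):
    """Program under test: return the larger of a and b."""
    return a if compare(a, b) else b


# Each mutant replaces the original '>' with another relational operator.
MUTANT_OPERATORS = {
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def surviving_mutants(test_inputs):
    """Mutants whose output matches the original program on every test input."""
    survivors = []
    for name, op in MUTANT_OPERATORS.items():
        killed = any(program(a, b, op) != program(a, b) for a, b in test_inputs)
        if not killed:
            survivors.append(name)
    return survivors


def mutation_score(test_inputs):
    """Fraction of mutants killed by the test data."""
    killed = len(MUTANT_OPERATORS) - len(surviving_mutants(test_inputs))
    return killed / len(MUTANT_OPERATORS)


print(surviving_mutants([(1, 2)]), mutation_score([(1, 2)]))
print(surviving_mutants([(1, 2), (2, 1)]), mutation_score([(1, 2), (2, 1)]))
```

Adding the second test input kills the `==` mutant. The `>=` mutant can never be killed because it always returns the same value as the original, which illustrates why equivalent mutants limit the achievable score.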

In  this  paper,  we  discuss  some  of  the  unique  requirements  of  an  interpreter  used  in  a  mutation-based  test¬ 
ing  environment.  We  then  describe  how  these  requirements  affected  the  design  and  implementation  of  the  For¬ 
tran  77  version  of  the  Mothra  interpreter.  Other  topics  covered  include  the  architecture  of  the  interpreter  and 
many  of  the  design  elements  that  it  incorporates.  We  also  describe  the  intermediate  language  used  by  Mothra  and
the  features  of  the  interpreter  that  are  needed  for  software  testing. 

[Ohba84]  Abstract:  This  paper  discusses  improvements  to  conventional  software  reliability  analysis  models  by 
making  the  assumptions  on  which  they  are  based  more  realistic.  In  an  actual  project  environment,  sometimes  no 
more  information  is  available  than  reliability  data  obtained  from  a  test  report.  The  models  described  here  are 
designed  to  resolve  the  problems  caused  by  this  constraint  on  the  availability  of  reliability  data.  By  utilizing  the 
technical  knowledge  about  a  program,  a  test,  and  test  data,  we  can  select  an  appropriate  software  reliability 
analysis  model  for  accurate  quality  assessment.  The  delayed  S-shaped  growth  model,  the  inflection  S-shaped 
model,  and  the  hyperexponential  model  are  proposed. 
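
For orientation, the mean-value functions usually associated with these model names in the reliability growth literature can be written down directly. The forms and the parameter names (`a` for the expected total number of errors, `b` for a detection rate, `c` for an inflection parameter, and weights and rates for the hyperexponential classes) are assumptions drawn from that general literature, not from this abstract, and the numeric values are made up.

```python
import math


def exponential_growth(t, a, b):
    """Conventional exponential growth curve, shown for comparison."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_shaped(t, a, b):
    """Delayed S-shaped curve: detection lags behind the exponential curve early on."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def inflection_s_shaped(t, a, b, c):
    """Inflection S-shaped curve; c = 0 reduces it to the exponential curve."""
    return a * (1.0 - math.exp(-b * t)) / (1.0 + c * math.exp(-b * t))


def hyperexponential(t, a, weights, rates):
    """Weighted mixture of exponential curves, one per class of program parts."""
    return a * sum(w * (1.0 - math.exp(-r * t)) for w, r in zip(weights, rates))


for week in (1, 5, 20):
    print(week,
          round(exponential_growth(week, 100, 0.2), 1),
          round(delayed_s_shaped(week, 100, 0.2), 1),
          round(inflection_s_shaped(week, 100, 0.2, 4.0), 1))
```

Which curve is appropriate depends on what is known about the program and the test process, which is the selection problem the abstract raises.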

[Ohba89]  Abstract:  This  paper  discusses  the  improvement  of  conventional  software  reliability  growth  models 
by  elimination  of  the  unreasonable  assumption  that  errors  or  faults  in  a  program  can  be  perfectly  removed  when 
they  are  detected.  The  results  show  that  exponential-type  software  reliability  growth  models  that  deal  with  error- 
counting  data  could  be  used  even  if  the  perfect  debugging  assumption  were  not  held,  in  which  case  the  interpreta¬ 
tion  of  the  model  parameters  should  be  changed.  An  analysis  of  real  project  data  is  presented. 

[Okad82]  Abstract:  In  order  to  obtain  a  usable  effort  estimation  for  a  small  scale  project,  an  earlier  phase  of 
CAD/CAM  system  software  development  was  carefully  studied.  Upon  analysis  of  data  obtained,  three  addi¬ 
tional  attributes  other  than  the  number  of  source  statements  were  taken  into  account  as  a  basis  for  the  effort  esti¬ 
mation.  They  were  1)  complexity,  2)  personnel  skill  and  3)  specification  volatility.  Subsequently,  a  set  of  models 
for  the  small  project  effort  estimation  was  derived.  Program  size-effort  correlations  obtained  were  higher  than 
0.972.  The  models  were  then  applied  to  the  consecutive  phases  of  the  same  project.  The  fitness  between  the 
estimated  effort  and  the  actual  effort  was  satisfactory  in  a  practical  sense. 

[Olde77]  Abstract:  Software  science  techniques  have  been  used  to  provide  a  framework  for  evaluation  of
problem  solving  systems.  In  that  effort,  two  methods  for  calculating  the  level  of  a  language  (L  and  L)  were  used;  it
was  suspected  that  L,  while  adequate  in  that  application,  might  be  inferior  to  L.  By  using  a  set  of  hypothetical 
languages,  each  with  different  intrinsic  data  structures  and  operators,  it  is  shown  here  that  when  an  inappropriate 
language  is  applied  to  some  problems,  L  may  reflect  an  inaccurately  large  value  for  language  level,  and  can  some¬ 
times  be  made  to  yield  an  arbitrary  value.  Since  L  is  often  as  easily  applied  as  L,  and  does  not  exhibit  this 
anomalous  behavior,  it  is  suggested  that  its  general  use  is  to  be  preferred. 

[Olde83]  Abstract:  This  paper  describes  a  technique  for  predicting  the  execution  behavior  of  a  source  program 
or  a  software  design  specification.  As  a  by-product  of  syntactic  analysis,  a  program  graph  is  constructed  which 
can  subsequently  be  treated  as  the  graph  of  a  finite  automaton.  The  expression  for  execution  behavior  is  the  regu¬ 
lar  expression  of  the  graph.  Several  simplification  techniques  for  these  expressions  are  discussed  and  exempli¬ 
fied.  In  particular,  the  substitution  of  known  values  for  program  segments  followed  by  constant  folding  cannot  be 
done  indiscriminately;  the  allowable  situations  are  characterized.  Applications  include  the  prediction  of  execu¬ 
tion  time  for  a  program  or  a  software  design,  other  forms  of  language  analysis,  and  program  restructuring. 

[Olen86]  Abstract:  This  paper  presents  a  flexible  and  general  mechanism  for  specifying  problems  relating  to  the 
sequencing  of  events  and  mechanically  translating  them  into  dataflow  analysis  algorithms  capable  of  solving 
those  problems.  Dataflow  analysis  has  been  used  for  quite  some  time  in  compiler  code  optimization.  It  has 
recently  gained  increasing  attention  as  a  way  of  statically  checking  for  the  presence  or  absence  of  errors  and  as  a 
way  of  guiding  the  test  selection  process.  Most  static  analyzers,  however,  have  been  custom-built  to  search  for 
the  fixed,  and  often  quite  limited,  classes  of  dataflow  conditions.  We  show  that  the  range  of  sequences  for  which 
it  is  interesting  and  worthwhile  to  search  is  actually  quite  broad  and  diverse.  We  create  a  formalism  for  specifying 
this  diversity  of  conditions.  We  then  show  that  these  conditions  can  be  modeled  essentially  as  dataflow  analysis 
problems  for  which  effective  solutions  are  known  and  further  show  how  these  solutions  can  be  exploited  to  serve  as
the  basis  for  mechanical  creation  of  analyzers  for  these  conditions. 

[Oste76a]  Summary:  This  paper  describes  DAVE,  a  system  for  analyzing  Fortran  programs.  DAVE  is  capable  of
detecting  the  symptoms  of  a  wide  variety  of  errors  in  programs,  as  well  as  assuring  the  absence  of  these  errors.  In 
addition,  DAVE  exposes  and  documents  subtle  data  relations  and  flows  within  programs.  The  central  analytic 
procedure  used  is  a  depth  first  search.  DAVE  itself  is  written  in  Fortran.  Its  implementation  at  the  University  of 
Colorado  and  some  early  experience  are  described. 

[Oste76b]  Abstract:  This  paper  describes  DAVE,  an  automatic  program  testing  aid  which  performs  a  static 
analysis  of  Fortran  programs.  DAVE  analyzes  the  data  flows  both  within  and  across  subprogram  boundaries  of 
Fortran  programs,  and  is  able  to  detect  occurrences  of  uninitialized  and  dead  variables  in  such  programs.  The 
paper  shows  how  this  capability  facilitates  the  detection  of  a  wide  variety  of  errors,  many  of  which  are  often  quite 
subtle.  The  central  analytic  mechanism  in  DAVE  is  a  depth-first  search  procedure  which  enables  DAVE  to  exe¬ 
cute  efficiently.  Some  experiences  with  DAVE  are  described  and  evaluated  and  some  future  work  is  projected. 
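
As a rough illustration of the kind of check described above (not DAVE's actual algorithm, which operates on Fortran programs), the following Python sketch walks a small, hypothetical control-flow graph depth-first and reports variables that may be referenced before they are defined. The graph encoding and the names `find_uninitialized`, `uses`, `defs`, and `succ` are assumptions made for this example only.

```python
def find_uninitialized(cfg, entry):
    """Return {(node, var)} where var may be used before any definition.

    cfg maps node -> {"uses": set, "defs": set, "succ": list}.
    Within a node, uses are assumed to happen before defs.
    """
    findings = set()
    seen = set()  # (node, defined-variable set) states already explored

    def dfs(node, defined):
        state = (node, defined)
        if state in seen:
            return
        seen.add(state)
        info = cfg[node]
        for var in info["uses"]:
            if var not in defined:
                findings.add((node, var))
        now_defined = defined | frozenset(info["defs"])
        for nxt in info["succ"]:
            dfs(nxt, now_defined)

    dfs(entry, frozenset())
    return findings


# Hypothetical program: y is defined on only one branch before the join.
cfg = {
    "start": {"uses": set(), "defs": {"x"}, "succ": ["then", "else"]},
    "then": {"uses": {"x"}, "defs": {"y"}, "succ": ["join"]},
    "else": {"uses": set(), "defs": set(), "succ": ["join"]},
    "join": {"uses": {"x", "y"}, "defs": set(), "succ": []},
}
print(find_uninitialized(cfg, "start"))
```

The sketch tracks the set of defined variables along each path, so it flags `y` at `join` because the `else` branch never defines it. A production analyzer would use a fixed-point data flow formulation rather than enumerating path states.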

[Oste77]  Abstract:  An  unfortunate  characteristic  of  current  static  analysis  algorithms  is  their  apparent  inability 
to  distinguish  between  executable  and  unexecutable  program  paths.  The  definitive  determination  of  executability 
of  a  given  path  has  long  been  known  to  be  unachievable.  This  paper  presents  some  heuristics  for  detecting  cer¬ 
tain  classes  of  unexecutable  paths  and  preliminary  findings  tending  to  indicate  that  the  heuristics  can  be  expected 
to  be  rather  effective.  The  heuristics  are  based  upon  the  application  of  existing  static  data  flow  analysis  algo¬ 
rithms  and  hence  offer  hope  of  coexisting  with  and  guiding  diagnostic  and  optimization  scans  which  also  use  data 
flow  analysis. 

[Oste80]  Abstract:  This  paper  presents  an  approach  to  integrating  four  techniques  for  testing,  analysis  and  verification
into  one  overall  strategy  for  incrementally  raising  the  confidence  in  software  in  a  cost-effective  way.  The
paper  summarizes  the  strengths,  weaknesses,  and  operational  characteristics  of  dynamic  testing,  static  analysis, 
symbolic  execution  and  formal  verification.  It  uses  a  detailed  example  as  an  illustration.  Next  the  integrated  stra¬ 
tegy  is  presented.  Finally,  there  is  a  discussion  of  how  this  strategy  can  be  used  to  raise  confidence  in  software 
requirements  and  design  specifications  as  well. 

[Oste83]  Abstract:  This  paper  discusses  the  goals  and  methods  of  the  Toolpack  project  and  in  this  context
discusses  the  architecture  and  design  of  the  software  system  being  produced  as  the  focus  of  the  project.  Toolpack 
is  presented  as  an  experimental  activity  in  which  a  large  software  tool  environment  is  being  created  for  the  pur¬ 
pose  of  general  distribution  and  then  careful  study  and  analysis.  The  paper  begins  by  explaining  the  motivation 
for  building  integrated  tool  sets.  It  then  proceeds  to  explain  the  basic  requirements  that  an  integrated  system  of 
tools  must  satisfy  in  order  to  be  successful  and  to  remain  useful  both  in  practice  and  as  an  experimental  object. 
The  paper  then  summarizes  the  tool  capabilities  that  will  be  incorporated  into  the  environment.  It  then  goes  on 
to  present  a  careful  description  of  the  actual  architecture  of  the  Toolpack  integrated  tool  system.  Finally  the 
Toolpack  project  experimental  plan  is  presented,  and  future  plans  and  directions  are  summarized. 

[Oste84]  Abstract:  This  paper  presents  a  view  of  how  various  testing,  analysis  and  debugging  techniques  can  be
integrated  into  a  tool  supported  methodology.  The  paper  is  composed  of  two  major  components.  In  the  first,  the 
techniques  are  described  in  detail,  compared  and  contrasted.  An  integrating  methodology  is  proposed.  The 
second  component  of  the  paper  deals  with  Toolpack,  a  specific  ensemble  of  tools  having  goals  similar  to  those 
described  in  the  first  component,  and  IST,  the  Integrated  System  of  Tools,  an  integration  strategy  for  these  tools.
This  second  part  of  the  paper  indicates  how  Toolpack/IST  could  be  configured  into  a  system  capable  of  imple¬ 
menting  the  integrated  strategy  of  the  first  section  in  an  efficient,  effective  way. 

[Oste87]  Abbreviated  Introduction:  In  this  paper  we  have  suggested  that  the  notion  of  a  “process  program”,
namely  an  object  which  has  been  created  by  a  development  process,  and  which  is  itself  a  software  process
description,  should  become  a  key  focus  of  software  engineering  research  and  practice.  We  believe  that  the
essence  of  software  engineering  is  the  study  of  effective  ways  of  developing  process  programs  and  of  maintaining 
their  effectiveness  in  the  face  of  the  need  to  make  changes. 

The  main  suggestions  presented  here  revolve  around  the  notion  that  process  programs  must  be  defined  in
a  precise,  powerful  and  rigorous  formalism,  and  that  once  this  has  been  done,  the  key  activities  of  development 
and  evolution  of  both  process  programs  themselves  and  applications  programs  can  and  should  be  carried  out  in  a 
more  or  less  uniform  way. 

This  strongly  suggests  the  importance  of  devising  a  process  programming  language  and  a  software  environ¬ 
ment  capable  of  compiling  and  interpreting  process  programs  written  in  that  language.  Such  an  environment 
would  become  a  vehicle  for  the  organization  of  tools  for  facilitating  development  and  maintenance  of  both  the 
specified  process,  and  the  process  program  itself.  It  would  also  provide  a  much  needed  mechanism  for  providing 
substantive  support  for  software  measurement  and  management. 

[Ostr80]  Overview:  Testing  is  the  most  common  way  of  gaining  confidence  in  the  correctness  of  software.
Despite  a  long  history  of  practical  testing  experience,  it  is  only  during  the  last  five  years  that  researchers  have 
attempted  to  formulate  a  theoretical  foundation  for  testing. 

The  initial  steps  in  this  direction  were  taken  by  Goodenough  and  Gerhart,  who  formulated  an  ambitious
theory  which  described  the  conditions  under  which  a  program  can  be  determined  to  be  correct  by  testing. 

The  problems  of  a  theory  based  on  the  concept  of  ideal  tests  are  of  three  types:  formal  unsolvability, 
impracticality,  and  unrealistic  assumptions.  As  we  shall  see,  theories  with  less  ambitious  goals  as  well  as  various 
methodologies  used  in  practice,  must  also  face  these  problems. 

[Ostr84]  Abstract:  A  study  has  been  made  of  the  software  errors  committed  during  development  of  an  interac¬ 
tive  special-purpose  editor  system.  This  product,  developed  for  commercial  production  use,  has  been  followed 
during  nine  months  of  coding,  unit  testing,  function  testing,  and  system  testing.  Detected  problems  and  their 
fixes  have  been  described  by  testers  and  debuggers.  A  new  fault  categorization  scheme  was  developed  from  these 
descriptions  and  used  to  classify  the  173  faults  that  resulted  from  the  project’s  errors.  For  each  error,  we  asked 
the  programmers  to  select  its  most  likely  cause,  report  the  stages  of  the  software  development  cycle  in  which  the 
error  was  committed  and  the  problem  first  noted,  and  the  circumstances  of  the  problem’s  detection  and  isola¬ 
tion,  including  time  required,  techniques  tried,  and  successful  techniques.  The  results  collected  in  this  study  are 
compared  to  results  from  earlier  studies,  and  similarities  and  differences  are  noted. 

[Ostr86]  Abstract:  The  overall  goal  of  software  testing  is  to  expose  errors  that  exist  in  program  code.  The 
specific  goal  of  specification-based  or  black-box  test  case  design  is  to  create  a  series  of  test  cases  that  fully  exer¬ 
cise  the  functionality  of  the  software.  To  achieve  this  goal,  it  is  necessary  to  insure  a  systematic  and  comprehen¬ 
sive  treatment  of  the  specification.  Such  a  treatment  is  particularly  difficult  when  the  specification  is  a  large, 
evolving  document,  written  in  a  natural  language,  and  test  case  design  is  to  be  performed  by  a  multi-person  team. 
Even  if  only  some  of  these  factors  are  present,  a  strategy  is  needed  to  assure  that  the  specification  has  been  com¬ 
pletely  considered.  As  the  size  of  the  specification  and  the  test  team  grows,  the  need  for  a  tool  to  manage  the 
process  becomes  more  pressing.  This  paper  describes  a  strategy  for  managing  specification-based  testing,  pro¬ 
poses  such  a  tool,  and  describes  its  use. 

[Ostr88]  Abbreviated  Introduction:  A  method  for  creating  functional  test  suites  has  been  developed  in  which  a
test  engineer  analyzes  the  system  specification,  writes  a  series  of  formal  test  specifications,  and  then  uses  a  gen¬ 
erator  tool  to  produce  test  descriptions  from  which  test  scripts  are  written.  The  advantages  of  this  method  are 
that  the  tester  can  easily  modify  the  test  specification  when  necessary,  and  can  control  the  complexity  and 
number  of  the  tests  by  annotating  the  test  specifications  with  constraints. 
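
The idea of generating test descriptions from a test specification annotated with constraints can be sketched as follows. This is a minimal Python illustration, assuming a specification is a set of categories with choices and that constraints are simple predicates; the names `generate_frames`, `categories`, and `constraints` and the example categories are hypothetical and are not taken from the paper or its tool.

```python
from itertools import product


def generate_frames(categories, constraints=()):
    """Yield one dict per combination of choices that satisfies every constraint."""
    names = list(categories)
    for combo in product(*(categories[name] for name in names)):
        frame = dict(zip(names, combo))
        if all(ok(frame) for ok in constraints):
            yield frame


# Hypothetical specification for a file-search command.
categories = {
    "file_size": ["empty", "small", "large"],
    "pattern": ["absent", "present"],
}

# Hypothetical constraint: an empty file cannot contain the pattern.
constraints = [
    lambda f: not (f["file_size"] == "empty" and f["pattern"] == "present"),
]

frames = list(generate_frames(categories, constraints))
for frame in frames:
    print(frame)
```

Adding or tightening a constraint reduces the number of generated frames without touching the category lists, which is the kind of control over test count that the introduction describes.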

[Otte79]  Abstract:  A  major  portion  of  the  problems  associated  with  software  development  might  be  blamed  on 
the  lack  of  appropriate  tools  to  aid  in  the  planning  and  testing  phases  of  software  projects.  As  one  step  towards 
solving  this  problem,  this  paper  presents  a  model  to  estimate  the  number  of  bugs  remaining  in  the  system  at  the 
beginning  of  the  testing  and  integration  phases  of  development.  The  model,  based  on  software  science  metrics, 
was  tested  using  data  currently  available  in  the  literature.  Extensions  to  the  model  are  also  presented  which  can 
be  used  to  obtain  such  estimates  as  the  expected  amount  of  personnel  and  computer  time  required  for  project 
validation.

[Otte81]  Abstract:  An  earlier  paper  presented  a  model  based  on  software  science  metrics  to  give  a  quantitative
estimate  of  the  number  of  bugs  in  a  programming  project  at  the  time  validation  of  the  project  begins.  In  this 
paper,  we  report  the  results  from  an  attempt  to  expand  the  model  to  estimate  the  total  number  of  bugs  expected 
during  the  total  project  development.  This  new  hypothesis  has  been  tested  using  the  data  currently  available  in 
the  literature  along  with  data  from  student  projects.  The  model  fits  the  published  data  reasonably  well;  however,
the  results  obtained  using  the  student  data  are  not  conclusive. 
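
The abstracts above do not state the model's formula. As a hedged illustration of a software science estimate of this general kind, the Python sketch below computes Halstead's program volume and applies the commonly quoted relation B = V / 3000. Whether the cited papers use exactly this relation is an assumption; the function names and the sample counts are hypothetical.

```python
import math


def halstead_volume(n1, n2, N1, N2):
    """Halstead volume V = N * log2(n).

    n1, n2: number of distinct operators and operands.
    N1, N2: total occurrences of operators and operands.
    """
    return (N1 + N2) * math.log2(n1 + n2)


def estimated_bugs(volume, volume_per_bug=3000.0):
    """Estimate bug count as V / 3000 (assumed constant, see lead-in)."""
    return volume / volume_per_bug


# Hypothetical counts: 32 distinct symbols, 1200 total occurrences.
volume = halstead_volume(n1=16, n2=16, N1=600, N2=600)
print(volume, estimated_bugs(volume))
```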

[Owic75]  Abstract:  A  language  for  parallel  programming,  with  a  primitive  construct  for  synchronization  and 
mutual  exclusion,  is  presented.  Hoare’s  deductive  system  for  proving  partial  correctness  of  sequential  programs 
is  extended  to  include  the  parallelism  described  by  the  language.  The  proof  method  lends  insight  into  how  one 
should  understand  and  present  parallel  programs.  Examples  are  given  using  several  of  the  standard  problems  in 
the  literature.  Methods  for  proving  termination  and  the  absence  of  deadlock  are  also  given. 

[Owic76]  Abstract:  An  axiomatic  method  for  proving  a  number  of  properties  of  parallel  programs  is  presented. 
Hoare  has  given  a  set  of  axioms  for  partial  correctness,  but  they  are  not  strong  enough  in  most  cases.  This  paper 
defines  a  more  powerful  deductive  system  which  is  in  some  sense  complete  for  partial  correctness.  A  crucial 
axiom  provides  for  the  use  of  auxiliary  variables,  which  are  added  to  a  parallel  program  as  an  aid  to  proving  it 
correct.  The  information  in  a  partial  correctness  proof  can  be  used  to  prove  such  properties  as  mutual  exclusion, 
freedom  from  deadlock,  and  program  termination.  Techniques  for  verifying  these  properties  are  presented  and 
illustrated  by  application  to  the  dining  philosophers  problem. 

[Owic82]  Abstract:  A  liveness  property  asserts  that  program  execution  eventually  reaches  some  desirable  state. 
While  termination  has  been  studied  extensively,  many  other  liveness  properties  are  important  for  concurrent  pro¬ 
grams.  A  formal  proof  method,  based  on  temporal  logic,  for  deriving  liveness  properties  is  presented.  It  allows  a 
rigorous  formulation  of  simple  informal  arguments.  How  to  reason  with  temporal  logic  and  how  to  use  safety 
(invariance)  properties  in  proving  liveness  is  shown.  The  method  is  illustrated  using,  first,  a  simple  programming 
language  without  synchronization  primitives,  then  one  with  semaphores.  However,  it  is  applicable  to  any  pro¬ 
gramming  language. 

[Paig72]  Abbreviated  Introduction:  In  this  paper,  we  present  a  technique  for  applying  some  fundamental  flow 
graph  concepts  to  computer  programs  to  yield  some  quantitative  measurement  of  software  complexity.  Due  to 
the  lack  of  any  complete  testing  facility,  it  is  important  to  order  or  rank  the  priorities  in  which  subroutines  or  por¬ 
tions  of  subroutines  should  be  tested.  In  this  manner,  since  all  subroutines  cannot  be  completely  checked  out,  at 
least  the  more  critical  segments  can  be  flagged  for  testing. 

[Paig75]  Abstract:  Current  interests  in  software  engineering  have  posed  serious  questions  about  the  evolution  of 
programs  and  languages.  Computer  programs  are  not  simply  collections  of  statements;  they  involve  specific 
structural  relationships  between  the  program  elements.  Program  structure  has  been  discussed  as  being  an  impor¬ 
tant  influence  on  the  ease  with  which  programs  can  be  constructed,  verified,  understood,  and  changed.  The  dis¬ 
cipline  of  “structured  programming”  has  been  developed  because  computer  scientists  have  sought  to  better  con¬ 
trol  and  understand  the  programming  process. 

Program  graphs  have  been  used  as  a  vehicle  to  focus  attention  on  the  structure  of  a  program.  In  this  paper 
a  systematic  methodology  for  partitioning  a  program  graph  (digraph)  to  highlight  the  relationships  between  pro¬ 
gram  elements  is  introduced  along  with  an  attendant  notation.  This  notation  is  described  in  purely  mathematical 
terms  in  the  first  section,  and  then  the  programming-related  implications  of  this  approach  are  addressed  in  the 
second  section. 

[Paig77b]  Abstract:  In  recent  years,  applications  of  graph  theory  to  computer  software  have  given  fruitful
results  and  attracted  more  and  more  attention.  A  program  graph  is  a  graph  structural  model  of  a  program 
exhibiting  the  flow  relation  of  connection  among  the  elements  (statements)  in  the  program.

One  particular  aspect  of  graph  analysis  which  is  extremely  useful  for  software  is  that  of  partitioning,  since 
it  both  reduces  the  complexity  of  the  system  and  highlights  the  actual  system  composition. 

The  purpose  of  this  paper  is  to  review  and  discuss  given  approaches  to  partitioning  graphs.  These  tech¬ 
niques  are  best  identified  by  the  names  of  the  units  into  which  the  program  is  grouped:  segments,  DD-paths, 
intervals,  classes,  and  level-i  paths.  The  objective  here  is  to  review  these  techniques  on  a  fundamental  level 
without  exhausting  all  the  uses  and  users  of  each  approach. 

[Paig78a]  Abstract:  This  paper  describes  a  quantitative  software  testing  methodology  for  non-structured  and 
structured  programs.  The  paper  first  treats  some  of  the  recent  work  by  McCabe  and  Paige  which  has  developed 
the  groundwork  for  a  quantitative  analysis  of  software  testing.  This  perspective  has  set  the  stage  for  use  of  a  pro¬ 
gram-graph  basis  as  the  thread  for  the  software  testing  effort.  A  basis  is  a  set  of  paths  such  that  any  other  path  in 
the  graph  can  be  expressed  as  a  combination  of  paths  in  the  basis.  A  technique  for  generating  a  unique,  practical 
basis  for  a  program-graph  is  introduced.  The  strategy  for  testing  programs  using  this  basis  is  discussed.  The  final 
section  treats  the  simplifying  effect  of  structured  programs  on  this  testing  approach. 
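
For a connected program graph with a single entry and a single exit, the number of paths in such a basis is given by McCabe's cyclomatic number, E - N + 2, where E is the number of edges and N the number of nodes. The Python sketch below computes that count for a small, hypothetical graph; it does not reproduce the paper's technique for choosing the basis paths themselves, and the function name and edge list are assumptions for this example.

```python
def cyclomatic_number(edges):
    """Return E - N + 2 for a connected, single-entry, single-exit graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical program graph: an if-then-else followed by a loop.
edges = [
    ("entry", "then"), ("entry", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]
print(cyclomatic_number(edges))
```

With seven edges and six nodes the result is 3, matching the two decision points in the graph plus one.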

[Paig81]  Abstract:  A  complete  software  testing  process  must  concentrate  on  examination  of  the  software 
characteristics  as  they  may  impact  reliability.  Software  testing  has  largely  been  concerned  with  structural  tests, 
that  is,  test  of  program  logic  flow.  In  this  paper,  a  comparison  software  test  technique  for  the  program  data 
called  data  space  testing  is  described. 

An  approach  to  data  space  analysis  is  introduced  with  an  associated  notation.  The  concept  is  to  identify 
the  sensitivity  of  the  software  to  a  change  in  a  specific  data  item.  The  collective  information  on  the  sensitivity  of 
the  program  to  all  data  items  is  used  as  a  basis  for  test  selection  and  generation  of  input  values. 

[Panz76]  Abstract:  A  test  procedure  is  a  formal  specification  of  test  cases  to  be  applied  to  one  or  more  target 
program  modules.  Test  procedures  are  executable.  A  process  called  the  VERIFIER  applies  a  test  procedure  to 
its  target  modules  and  produces  an  exception  report  indicating  which  test  cases,  if  any,  failed. 

Test  procedures  facilitate  thorough  software  testing  by  allowing  individual  modules  or  arbitrary  groups  of 
modules  to  be  thoroughly  tested  outside  the  environment  in  which  they  will  eventually  reside.  Test  procedures 
are  complete,  self-contained,  self-validating  and  execute  automatically.  Test  procedures  are  a  deliverable  product 
of  the  software  development  process  and  are  used  for  both  initial  checkout  and  subsequent  regression  testing  of 
target  program  modifications. 

Test  procedures  are  coded  in  a  new  language  called  TPL  (Test  Procedure  Language).  The  paper  analyzes 
current  testing  practices,  describes  the  structure  and  design  of  test  procedures  and  introduces  the  Fortran  Test 
Procedure  Language. 

[Panz78a]  Abstract:  Typical  testing  activities  may  involve  many  hundreds  of  tests.  An  automatic  software  test 
driver  assists  the  tester  by  managing  all  of  the  test  data,  and  automatically  running  the  tests.  Savings  during 
regression  testing  can  be  significant. 

[Panz78b]  Abbreviated  Introduction:  The  execution  of  software  test  cases  and  the  verification  of  test  results 
may  be  performed  automatically  by  a  new  type  of  program  called  an  automatic  software  test  driver.  When  using 
an  automatic  test  driver,  a  formal  test  procedure  is  coded  in  a  special  test  language.  The  test  procedure  takes  the 
place  of  the  test  data  and  test  setup  instructions  of  conventional  testing,  and  controls  the  automatic  test  driver.  An
automatic  test  driver  applies  one  test  procedure  to  all  or  part  of  a  target  program,  executes  all  of  the  test  cases 
specified  in  the  test  procedure,  and  verifies  that  the  results  of  each  test  case  are  correct.  This  paper  describes  the 
Fortran  Test  Procedure  Language  (TPL/F)  which  was  developed  at  General  Electric  and  is  used  for  specifying 
test  procedures  for  Fortran  software. 

The  specific  goals  of  the  TPL/F  automatic  test  driver  are  as  follows.  The  need  for  writing  drivers  and  stubs 
for  module  and  subsystem  testing  is  eliminated  since  the  TPL/F  system  can  test  any  combination  of  one  or  more 
modules  independently  of  the  rest  of  the  target  program.  The  TPL/F  test  language  provides  a  standard  format  for 
specifying  software  tests  and  the  test  procedure  processor  provides  a  standard  test  execution  setup.  Since  the
formal  test  procedures  specify  the  correct  outcomes  of  test  cases,  the  test  procedure  processor  automates  the 
verification  of  test  execution  results. 

[Panz78c]  An  automatic  software  test  driver  is  a  new  type  of  software  tool  which  controls  and  monitors  the  exe¬ 
cution  of  software  tests.  An  automatic  test  driver  is  controlled  by  a  formal  test  procedure  coded  in  a  special 
software  test  language.  The  test  procedure  replaces  the  test  data  and  test  setup  instructions  of  conventional  test¬ 
ing.  The  specific  goals  of  automatic  test  drivers  are  to  eliminate  the  need  for  writing  drivers  and  stubs  for  module 
and  subsystem  testing,  to  provide  a  standard  format  and  language  for  specifying  software  tests,  to  provide  a  stan¬ 
dard  execution  setup  for  software  tests,  and  to  automate  the  verification  of  test  execution  results. 

A  test  procedure  contains  input  data  to  be  supplied  to  the  program  under  test  and  model  outputs  against 
which  actual  outputs  of  the  target  program  are  verified.  Typically,  ninety  percent  or  more  of  the  text  of  a  test  pro¬ 
cedure  consists  of  model  outputs  which  must  be  revised  each  time  the  target  program  is  modified.  The  TPL/2.0 
automatic  software  test  driver  described  in  this  paper  automates  both  the  initial  generation  and  subsequent  revi¬ 
sion  of  test  procedure  model  outputs. 
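
The relationship between a test procedure, its model outputs, and the driver can be sketched in a few lines of Python. This is only an analogy under stated assumptions: TPL and its Fortran variants are not shown here, and the names `run_test_procedure`, `regenerate_model_outputs`, and `safe_divide` are hypothetical.

```python
def run_test_procedure(target, cases):
    """Apply each test case to target and return exception-report lines.

    cases is a list of (name, args, model_output) tuples.
    """
    report = []
    for name, args, model_output in cases:
        try:
            actual = target(*args)
        except Exception as exc:
            report.append(f"{name}: raised {type(exc).__name__}")
            continue
        if actual != model_output:
            report.append(f"{name}: expected {model_output!r}, got {actual!r}")
    return report


def regenerate_model_outputs(target, cases):
    """Rebuild model outputs from the current target, for review by a tester."""
    return [(name, args, target(*args)) for name, args, _ in cases]


def safe_divide(a, b):
    """Hypothetical module under test."""
    return a / b if b != 0 else 0.0


cases = [
    ("halves", (1, 2), 0.5),
    ("zero_divisor", (1, 0), 0.0),
    ("wrong_model", (6, 3), 3.0),  # deliberately stale model output
]
report = run_test_procedure(safe_divide, cases)
print(report)
```

Only failing cases appear in the report, mirroring the exception report described in [Panz76]. The second helper mirrors the idea of automating model-output revision after the target changes, although regenerated outputs still need human review before they are trusted.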

[Parn72a]  Introduction:  In  two  earlier  reports,  we  have  suggested  some  techniques  to  be  used  in  producing
software  with  many  programmers.  The  techniques  were  especially  suitable  for  software  which  would  exist  in 
many  versions  due  to  modifications  in  methods  or  applications.  These  techniques  have  been  taught  in  an  under¬ 
graduate  course  and  used  in  an  experimental  project  in  that  course.  The  purpose  of  this  report  is  to  describe  the 
results  that  have  been  obtained  and  to  discuss  some  conclusions  which  we  have  reached.  The  experiment  was 
completely  uncontrolled,  the  programmers  generally  inexperienced  and  poor,  and  the  programming  system  used 
was  not  designed  for  the  task.  The  numerical  data  presented  below  have  no  real  value.  We  include  them  primarily 
as  an  illustration  of  the  type  of  result  that  can  be  obtained  by  use  of  the  techniques  described  in  the  earlier 
reports.  We  consider  these  results  a  drastic  improvement  over  the  state  of  the  art.  Major  changes  in  a  system  can 
be  confined  to  well-defined,  small,  subsystems.  No  intellectual  effort  is  required  in  the  final  assembly  or  “integra¬ 
tion”  phase. 

[Parn72b]  This  paper  discusses  modularization  as  a  mechanism  for  improving  the  flexibility  and  comprehensibil¬ 
ity  of  a  system  while  allowing  the  shortening  of  its  development  time.  The  effectiveness  of  a  “modularization”  is 
dependent  upon  the  criteria  used  in  dividing  the  system  into  modules.  A  system  design  problem  is  presented  and 
both  a  conventional  and  unconventional  decomposition  are  described.  It  is  shown  that  the  unconventional
decompositions  have  distinct  advantages  for  the  goals  outlined.  The  criteria  used  in  arriving  at  the  decompositions
are  discussed.  The  unconventional  decomposition,  if  implemented  with  the  conventional  assumption  that  a  module  consists  of  one  or  more  subroutines,  will  be
less  efficient  in  most  cases.  An  alternative  approach  to  implementation  which  does  not  have  this  effect  is 
sketched. 

[Parn72c]  Abstract:  This  paper  presents  an  approach  to  writing  specifications  for  parts  of  software  systems.  The 
main  goal  is  to  provide  specifications  sufficiently  precise  and  complete  that  other  pieces  of  software  can  be  writ¬ 
ten  to  interact  with  the  piece  specified  without  additional  information.  The  secondary  goal  is  to  include  in  the 
specification  no  more  information  than  necessary  to  meet  the  first  goal.  The  technique  is  illustrated  by  means  of 
a  variety  of  examples  from  a  tutorial  system. 

[Parn74]  Abstract:  This  paper  discusses  the  use  of  the  term  “hierarchically  structured”  to  describe  the  design  of
operating  systems.  Although  the  various  uses  of  this  term  are  often  considered  to  be  closely  related,  close  exami¬ 
nation  of  the  use  of  the  term  shows  that  it  has  a  number  of  quite  different  meanings.  For  example,  one  can  find 
two  different  senses  of  “hierarchy”  in  a  single  operating  system.  An  understanding  of  the  different  meanings  of 
the  term  is  essential,  if  a  designer  wishes  to  apply  recent  work  in  Software  Engineering  and  Design  Methodology. 
This  paper  attempts  to  provide  such  an  understanding. 

[Parn77]  This  paper  discusses  the  role  of  formal  and  precise  specifications  in  the  methodical  development  of 
software  which  we  know  to  be  correct.  The  differences  between  the  general  use  of  the  word  “specification”  and
the  engineering  use  of  that  term  are  discussed.  The  software  development  tasks  that  we  are  undertaking  require  a 
“divide  and  conquer”  approach  that  can  only  succeed  if  we  have  a  precise  way  of  describing  the  subproblems.  It 
is  shown  how  predicate  transformers  and  abstract  specifications  can  be  used  when  design  decisions  are  made. 
Two  examples  of  the  use  of  abstract  specifications  are  described  and  detailed  specifications  are  included. 

[Parn78]  Designing  software  to  be  extensible  and  easily  contracted  is  discussed  as  a  special  case  of  design  for 
change.  A  number  of  ways  that  extension  and  contraction  problems  manifest  themselves  in  current  software  are 
explained.  Four  steps  in  the  design  of  software  that  is  more  flexible  are  then  discussed.  The  most  critical  step  is 
the  design  of  a  software  structure  called  the  “uses”  relation.  Some  criteria  for  design  decisions  are  given  and 
illustrated  using  a  small  example.  It  is  shown  that  the  identification  of  minimal  subsets  and  minimal  extensions 
can  lead  to  software  that  can  be  tailored  to  the  needs  of  a  broad  variety  of  users. 

[Parn79]  [The  author  has]  been  asked  to  discuss  the  chapter  “An  Appraisal  of  Program  Specifications,”  by
Liskov  and  Berzins.  Since  it  would  appear  that  the  authors  and  [the  author]  are  in  fundamental  agreement  on  the 
purpose  of  program  specifications,  [the  author]  will  say  little  about  our  common  position  and  focus  on  the  areas 
where  [the  author’s]  perception  of  the  role  of  specifications  seems  to  differ  somewhat  from  that  of  the  authors.
Most  of  [the  author’s]  comments  are  based  on  [his]  experience  in  using  both  formal  and  informal  program  specifications
in  a  variety  of  programming  projects  since  1970.

Interest  in  the  topic  of  program  specifications  derives  from  _  the  design  of  a  software
system  is  a  large  and  complex  task.  It  is  important  that  the  designers  be  able  to  record  the  intermediate  design
decisions  _  also  useful  to  be  able  to  evaluate  the  design  decisions  using  _
established  criteria.  All  of  the  concepts  mentioned  in  the  Liskov-Berzins  _  are  tools  for  these
purposes.  My  view  of  the  way  that  formal  _  design  decisions  can  be  used  during  the  program  development
process  _  in  [1].

[Parn85]  Abbreviated  Introduction:  This  report  comprises  eight  short  papers  that  were  completed  while  [the
author]  was  a  member  of  the  Panel  on  Computing  in  Support  of  Battle  Management,  convened  by  the  Strategic 
Defense  Initiative  Organization  (SDIO).  SDIO  is  part  of  the  Office  of  the  US  Secretary  of  Defense.  The  panel 
was  asked  to  identify  the  computer  science  problems  that  would  have  to  be  solved  before  an  effective  antiballistic 
missile  (ABM)  system  could  be  deployed.  It  is  clear  to  everyone  that  computers  must  play  a  critical  role  in  the 
systems  that  SDIO  is  considering.  The  essays  that  constitute  this  report  were  written  to  organize  [the  author’s] 
thoughts  on  these  topics  and  were  submitted  to  SDIO  with  [the  author’s]  resignation  from  the  panel.

[Parn88]  Abbreviated  Introduction:  Under  AECB  Projects  No.  2.127.1  and  No.  2.127.2  members  of  the  Depart¬ 
ment  of  Computing  and  Information  Science  of  Queen’s  University  [the  authors]  were  asked  to  review  the 
software  being  prepared  to  control  the  two  shutdown  systems  for  the  nuclear  reactors  at  the  Darlington  generat¬ 
ing  station.  New  Canadian  nuclear  generating  stations  have  two  shutdown  systems,  each  independent  of  the 
other  and  both  independent  of  the  reactivity  and  process  control  system.  Although  earlier  Ontario  Hydro  gen¬ 
erating  stations  used  computers  for  reactivity  control,  the  shutdown  systems  had  been  kept  as  simple  as  possible 
and  were  built  using  hardwired  logic.  In  the  Darlington  plant  both  shutdown  systems  (SDS-1  and  SDS-2)  will  be 
controlled  by  computer  systems.  A  significant  factor  in  the  reliability  and  safety  of  those  systems  will  be  the  relia¬ 
bility  and  trustworthiness  of  the  software. 

[The  authors]  were  asked  to  examine  the  software  and  software  documentation  for  SDS-1  and  SDS-2  to 
determine  whether  they  meet  appropriate  standards  and  whether  they  could  be  certified  to  be  sufficiently  depend¬ 
able  for  such  a  critical  application,  (i.e.,  whether  the  documentation  would  enable  a  detailed  safety  evaluation  of 
the  software  to  be  carried  out  in  a  later  project  phase). 

[Parr80]  Abstract:  A  new  model  of  the  software  development  process  is  presented  and  used  to  derive  the  form 
of  the  resource  consumption  curve  of  a  project  over  its  life  cycle.  The  function  obtained  differs  in  detail  from  the 
Rayleigh  curve  previously  used  in  fitting  actual  project  data.  The  main  advantage  of  the  new  model  is  that  it
relates  the  rate  of  progress  which  can  be  achieved  in  developing  software  to  the  structure  of  the  system  being
developed.  This  leads  to  a  more  testable  theory,  and  it  also  becomes  possible  to  predict  how  the  use  of  structured 
programming  methods  may  alter  patterns  of  life  cycle  resource  consumption. 

[Pate89]  Abstract:  A  key  factor  in  the  acceptance  of  high  level  programming  languages  has  been  the  develop¬ 
ment  of  a  comprehensive  set  of  tools  to  support  the  user.  If  formal  languages  for  specification  are  to  achieve  the 
same  level  of  acceptance,  they  too  will  require  extensive  automated  support.  This  paper  describes  a  set  of  proto¬ 
type  tools  which  are  designed  to  assist  the  developer  in  the  use  of  formal  specification  techniques. 

[Payt82]  Abstract:  This  paper  describes  a  system  of  automated  tools  for  program  generation.  These  tools 
translate  formal  specifications  of  design  into  efficient  programs  to  perform  the  stated  task.  Compiler  generation
techniques  are  applied  to  create  a  general  system  that  is  applicable  to  the  development  of  a  wide  range  of 
software  products.  Usage  of  this  system  formalizes  the  software  development  process  thus  promoting  a  decrease 
in  software  design  and  development  costs  and  easing  the  maintenance  process.  The  software  process  is  not 
bound  to  a  particular  implementation  language  thus  software  portability  is  enhanced. 

[Pear84]  Abbreviated  Preface:  This  book  is  about  heuristics,  popularly  known  as  rules  of  thumb,  educated 
guesses,  intuitive  judgments  or  simply  common  sense.  In  more  precise  terms,  heuristics  stand  for  strategies  using 
readily  accessible  though  loosely  applicable  information  to  control  problem-solving  processes  in  human  beings 
and  machine.  This  book  presents  an  analysis  of  the  nature  and  the  power  of  typical  heuristic  methods,  primarily 
those  used  in  artificial  intelligence  (AI)  and  operations  research  (OR)  to  solve  problems  of  search,  reasoning, 
planning  and  optimization  on  digital  machines. 

The  discussions  in  this  book  follow  a  three-phase  pattern:  Presentation,  characterization,  and  evaluation. 
We  first  present  a  set  of  general-purpose  problem-solving  strategies  guided  by  heuristic  information,  then 
highlight  the  general  principles  and  properties  that  characterize  this  set  and,  finally,  we  present  mathematical 
analyses  of  the  performances  of  these  strategies  in  several  well-structured  domains.  Some  psychological  aspects 
of  how  people  discover  and  use  heuristics  are  discussed  briefly. 

[Perk86]  Abstract:  Metrics  researchers  are  currently  in  the  early  stages  of  validating  the  relationship  between
metrics  and  the  quality  problems  encountered  by  users  and  developers  of  software.  In  order  to  establish  these 
relationships,  large  amounts  of  data  defined  for  validating  specific  metrics  must  be  collected.  Before  performing 
such  costly  validation,  we  believe  the  metrics  should  be  evaluated  with  respect  to  whether  they  reflect  our  current 
understanding  of  quality  principles.  Our  preliminary  attempt  at  validation  focuses  on  a  human  vs.  automated 
approach  to  analyzing  an  existing  Ada  program.  The  program  consists  of  fourteen  packages  and  approximately 
150  procedures  and  functions.  Segments  of  this  code  were  selected  and  analyzed  with  respect  to  the  software 
quality  sub-criteria  of  flow  simplicity,  limited  visibility,  and  error  prevention  and  detection.  The  study  focuses  on 
disagreements  between  human  and  automated  analysis,  and  attempts  to  explain  those  discrepancies  and  suggest 
possible  ways  to  improve  both  measurement  techniques  and  the  quality  of  the  software  program  analyzed. 

[Perk87]  Abstract:  Our  investigation  applies  an  automated,  hierarchical,  Ada-specific  software  metrics  frame¬ 
work  to  Navy-supplied  Ada  software  to  determine  the  effectiveness  of  such  a  framework  as  an  aid  to  improving 
the  quality  of  Ada  software. 

The  metrics  framework  measures  six  software  criteria  and  consists  of  approximately  150  software  metric 
elements,  where  each  metric  element  relates  a  software  quality  principle  to  the  use  of  specific  features  of  the  Ada
language. 

The  investigation  involves:  1)  analysis  of  the  metric  scores  for  the  Navy-supplied  Ada  code,  2)  modifica¬ 
tions  of  the  Ada  code  to  correct  the  quality  problems  indicated  by  the  metric  scores,  resulting  in  two  improved 
versions  of  the  code  (the  first  incorporates  only  statement-by-statement  changes  and  the  second  incorporates 
changes  to  the  overall  organization  of  the  code),  and  3)  comparison  of  metric  scores  for  the  three  versions  of  the 
Ada  code. 

[Perr83]  Table  of  Contents:  Establishing  a  test  methodology.  Establishing  a  system  test  policy,  life  cycle  testing
approach.  Testing  an  application  system  test  plan.  Developing  an  application  system  test  plan,  testing  tech¬ 
niques,  testing  tools,  requirements  phase  testing,  design  phase  testing,  program  phase  testing,  test  phase  testing, 
installation  phase  testing,  maintenance  phase  testing,  testing  documentation.  Assessing  test  performance. 
Evaluating  the  effectiveness  of  testing.  Testing  tools.  Testing  metrics.  Bibliography. 

[Perr86]  Table  of  Contents:  Will  the  computer  do  what  I  want  to  do?  What  can  go  wrong  with  computerized 
applications,  and  what  to  do  about  it,  testing  business  fit,  testing  system  fit,  testing  people  fit.  Does  the  software 
work  correctly?  Developing  a  test  plan,  creating  testing  conditions,  verifying  the  correctness  of  the  software 
functions.  So  now  the  software  is  in  operation!  Validation  computer-produced  output.  Glossary.  Golden  rules. 
Index. 

[Pesc85]  Abstract:  For  the  validation  of  the  kernel  system  calls  of  a  family  of  UNIX  systems  a  knowledge  based 
test  environment  was  conceived.  A  prototype  version  is  currently  implemented  in  Prolog.  The  knowledge  base 
consists  essentially  of  three  parts: 

•  test  case  specifications  of  the  various  system  calls. 

•  a  test  suite  generator  with  predicates  including  information  about  UNIX  system  properties  and  sound  test 
practices,  and 

•  a  test  protocol  archive  including  utilities  to  extract  and  prepare  reports  about  the  test  results. 

All  information  in  the  knowledge  base  is  stored  as  Horn  clauses,  i.e.  facts  and  rules  immediately  to  be  consulted 
and  executed  by  a  Prolog  interpreter. 

[Pete77]  Abstract:  Over  the  last  decade,  the  Petri  net  has  gained  increased  usage  and  acceptance  as  a  basic 
model  of  systems  of  asynchronous  concurrent  computation.  This  paper  surveys  the  basic  concepts  and  uses  of 
Petri  nets.  The  structure  of  Petri  nets,  their  markings  and  execution,  several  examples  of  Petri  net  models  of 
computer  hardware  and  software,  and  research  into  the  analysis  of  Petri  nets  are  presented,  as  are  the  use  of  the 
reachability  tree  and  the  decidability  and  complexity  of  some  Petri  net  problems.  Petri  net  languages,  models  of 
computation  related  to  Petri  nets,  and  some  extensions  and  subclasses  of  the  Petri  net  model  are  also  briefly  dis¬ 
cussed. 
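
Illustrative sketch (not taken from [Pete77]): the marking and execution rules that the survey covers can be shown in a few lines of Python. The place and transition names below are hypothetical, and an ordinary place/transition net with weighted arcs is assumed.

```python
from typing import Dict, Tuple

Marking = Dict[str, int]
# A transition is (input places with arc weights, output places with arc weights).
Transition = Tuple[Dict[str, int], Dict[str, int]]


def enabled(marking: Marking, t: Transition) -> bool:
    """A transition is enabled when every input place holds enough tokens."""
    inputs, _ = t
    return all(marking.get(p, 0) >= n for p, n in inputs.items())


def fire(marking: Marking, t: Transition) -> Marking:
    """Return the marking reached by firing an enabled transition."""
    if not enabled(marking, t):
        raise ValueError("transition is not enabled")
    inputs, outputs = t
    new = dict(marking)
    for p, n in inputs.items():
        new[p] = new.get(p, 0) - n   # consume tokens from input places
    for p, n in outputs.items():
        new[p] = new.get(p, 0) + n   # produce tokens in output places
    return new


# Hypothetical producer/consumer net with two places.
produce: Transition = ({"idle": 1}, {"buffer": 1, "idle": 1})
consume: Transition = ({"buffer": 1}, {})

m0: Marking = {"idle": 1, "buffer": 0}
m1 = fire(m0, produce)
m2 = fire(m1, consume)
```

In this example `consume` is not enabled in `m0`, `m1` holds one token in `buffer`, and `m2` is the initial marking again. Enumerating all markings reachable this way is the idea behind the reachability tree mentioned in the abstract.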

[Pets85]  Introduction:  Selecting  test  cases  for  system  testing  of  the  PICS/DCPR  database  application  poses  a 
fundamental  problem.  Due  to  the  size  of  the  system,  methodologies  described  in  the  literature  do  not  apply. 
Their  formulations  of  “thorough  testing”  require  so  many  test  cases  that  they  are  not  practical  for  system  testing. 

To  deal  with  this  problem,  we  have  adopted  an  approach  to  test  case  selection  that  uses  a  simple  set  of 
priority  rules  to  judge  which  test  cases  are  more  important  than  others.  These  priority  rules  derive  from  the  char¬ 
ter  of  the  system  test  group  in  the  PICS/DCPR  project,  observations  about  developer  testing,  and  the  conse¬ 
quences  of  different  types  of  software  defects  on  users. 

These  practical  priority  rules  are  believed  to  constitute  a  more  realistic  approach  to  system-testing  large 
database  applications  than  current  theory  does. 

[Pimo75]  Abstract:  This  paper  deals  with  the  problem  of  assessing  the  reliability  of  programs  written  using  structured
programming  techniques  and  having  undergone  a  certain  amount  of  testing.  A  program  is  said  to  be  verified
if,  for  a  given  set  of  tests,  it  can  be  shown  that  every  case  of  interest  has  been  tested.  As  this  end  is,  however,
unattainable,  we  will  consider,  in  the  following,  that  a  program  is  verified  if  one  can  prove  that  all  the  logic  paths 
in  the  program  flow  graph  have  been  traversed.  Therefore,  we  will  consider  that  a  certain  degree  of  verification  is 
attained  with  a  given  set  of  tests,  according  to  the  number  of  paths  actually  traversed.  This  degree  of  verification, 
which  is  a  non-decreasing  function  of  the  number  of  tests  can  be  considered  as  an  assessment  of  program  relia¬ 
bility.  The  degree  of  verification  attained  through  experiments  can  then  be  deduced  from  the  images  of  experi¬ 
ments  in  the  program  flow  graph.  This  paper  defines  a  practical  procedure  to  perform  such  an  evaluation. 

[Pipp78]  Abbreviated  Introduction:  Many  complex  systems  such  as  those  found  in  a  computer  or  a  telephone 
exchange  are  constructed  by  interconnecting  a  large  number  of  simple  components.  The  complexity  of  these  systems
arises  from  the  number  of  components  and  the  intricacy  of  their  interconnections,  rather  than  from  any
great  complexity  of  the  components  themselves.  The  systems  formed  in  this  fashion  are  somehow  much  greater 
than  the  sum  of  their  parts. 

It  is  natural  to  assume  that  every  component  in  a  complex  system  is  there  for  a  reason,  but  although  it  may 
be  true  that  the  removal  of  any  component  would  cause  the  system  to  malfunction,  it  is  also  possible  that  an 
overall  reorganization  would  lead  to  a  working  system  with  many  fewer  components. 

Complexity  theory  seeks  to  determine  the  minimum  number  of  components  needed  for  these  systems.  It 
pursues  this  goal  in  two  ways:  by  finding  new  designs  that  call  for  fewer  components  and  by  showing  that  a  certain 
number  of  components  will  be  needed  no  matter  what  design  is  followed.  Finding  new  designs  for  a  system  has 
an  obvious  practical  significance:  it  can  increase  the  efficiency  of  the  system  and  reduce  its  cost.  The  second  type 
of  investigation,  which  sets  limits  beyond  which  further  attempts  at  improvement  are  futile,  is  equally  necessary 
for  a  complete  understanding  of  a  particular  system  and  is  often  much  harder  to  accomplish. 

[Piwo82]  Abstract:  Two  well-publicized  program  complexity  measures  are  software  science  and  cyclomatic 
complexity.  Three  areas  where  these  measures  do  not  always  follow  our  intuitive  notions  of  complexity  are:  struc¬ 
tured  vs  unstructured  programs,  nested  vs  sequential  predicates,  and  the  use  of  case  statements.  This  paper 
defines  a  nesting  level  complexity  measure  that  punishes  unstructuredness,  and  the  nesting  of  predicates,  and 
rewards  the  use  of  case  statements.  Examples  are  given  where  the  nesting  level  complexity  agrees  with  intuitive 
rankings  of  program  structures  where  software  science,  cyclomatic  complexity,  and  their  suggested  refinements 
do  not. 
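
Illustrative sketch (not taken from [Piwo82]): the code below computes only McCabe's cyclomatic number, V(G) = E - N + 2 for a single connected flow graph. It does not implement the paper's nesting level measure. The two hypothetical flow graphs show the point made in the abstract: two sequential predicates and two nested predicates receive the same cyclomatic value.

```python
from typing import Dict, List


def cyclomatic_complexity(graph: Dict[str, List[str]]) -> int:
    """V(G) = E - N + 2 for a single connected flow graph."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in graph.values())
    return edges - len(nodes) + 2


# Two IF statements in sequence (hypothetical node names).
sequential = {
    "entry": ["p1"],
    "p1": ["a", "j1"],
    "a": ["j1"],
    "j1": ["p2"],
    "p2": ["b", "exit"],
    "b": ["exit"],
}

# One IF statement nested inside another.
nested = {
    "entry": ["p1"],
    "p1": ["p2", "exit"],
    "p2": ["a", "j"],
    "a": ["j"],
    "j": ["exit"],
}

v_sequential = cyclomatic_complexity(sequential)
v_nested = cyclomatic_complexity(nested)
```

Both graphs yield the same value, so a measure that is meant to penalize nesting has to use information beyond the edge and node counts.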

[Pooc74]  Abstract:  Decomposition  and  conversion  algorithms  for  translating  decision  tables  are  surveyed  and 
contrasted  under  two  broad  categories:  the  mask  rule  technique  and  the  network  technique.  Also,  decision  table 
structure  is  briefly  covered,  including  checks  for  redundancy,  contradiction,  and  completeness;  decision  table 
notation  and  terminology;  and  decision  table  types  and  applications.  Extensive  literature  citations  are  provided. 
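
Illustrative sketch (not taken from [Pooc74]): the checks for redundancy, contradiction, and completeness can be demonstrated by brute-force enumeration on a small limited-entry table. This is not the mask rule or network technique surveyed in the paper, and the rule set and action names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# A rule is (condition entries "Y" / "N" / "-" for don't care, action name).
Rule = Tuple[Tuple[str, ...], str]


def matches(entries: Tuple[str, ...], combo: Tuple[str, ...]) -> bool:
    """True when a rule's condition entries accept the given combination."""
    return all(e == "-" or e == c for e, c in zip(entries, combo))


def check_table(rules: List[Rule], n_conditions: int) -> Dict[str, list]:
    """Classify every condition combination as missing, redundant, or contradictory."""
    report: Dict[str, list] = {"missing": [], "redundant": [], "contradictory": []}
    for combo in product("YN", repeat=n_conditions):
        hits = [i for i, (entries, _) in enumerate(rules) if matches(entries, combo)]
        actions = {rules[i][1] for i in hits}
        if not hits:
            report["missing"].append(combo)                 # table is incomplete
        elif len(actions) > 1:
            report["contradictory"].append((combo, hits))   # same case, different actions
        elif len(hits) > 1:
            report["redundant"].append((combo, hits))       # same case, same action
    return report


rules: List[Rule] = [
    (("Y", "-"), "approve"),
    (("N", "Y"), "review"),
    (("Y", "Y"), "reject"),
]
report = check_table(rules, n_conditions=2)
```

For this table the combination (N, N) is not covered by any rule, and (Y, Y) is claimed by rules 0 and 2 with different actions. Enumeration grows as 2^n in the number of conditions, so it only suits small tables.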

[Popk78]  Abstract:  This  report  discusses  the  use  of  flowchart  graphs,  adjacency  matrices,  and  zero-one  linear 
programming  to  find  the  minimum  number  of  tests  necessary  to  execute  every  segment  of  a  computer  program  at 
least  once.  The  methods  of  Lipow  are  used  as  the  basis  for  determining  the  maximum  incomparable  set,  i.e.,  the 
largest  set  of  program  segments  through  which  one  and  only  one  test  case  should  pass.  The  size  of  the  maximum 
incomparable  set  gives  the  minimum  number  of  tests  necessary  to  execute  each  segment  at  least  once,  while  the 
elements  of  this  set  give  the  paths  of  each  test.  The  report  develops  methods  for  finding  the  maximum  incompar¬ 
able  set  for  loopless  and  some  elementary  looping  flowcharts. 

[Post87]  Introduction:  In  1984,  L&N  authorized  a  software  engineering  project  to  create  a  real-time,  process¬ 
monitoring  program  that  would  be  embedded  in  a  large  process-control  system.  This  undertaking  was  named  the 
Process  Information  Management  Subsystem  (PIMS)  Trending  Project.  Before  undertaking  the  Trending  Pro¬ 
ject,  L&N  was  reporting  failure-density  factors  near  1.3.  The  subsequent  drop  to  0.072  represented  a  95-percent 
improvement  in  software  quality. 

As  well  as  quality  improvements,  we  measured  productivity  increases  on  the  Trending  project  compared 
to  earlier  L&N  projects. 

When  the  Trending  Project  software  was  delivered,  the  engineers  had  produced  29  lines  of  source  code 
per  staff-day.  Even  allowing  for  a  substantial  error  margin  in  the  estimates  for  productivity  factors,  the  gain  was 
more  than  200  percent. 

How  were  these  quality  and  productivity  improvements  achieved?  The  PEI  Testing  Methodology  -  an 
integrated  set  of  policies,  techniques,  metrics,  and  standards  -  was  added  to  the  L&N  quality-assurance  pro¬ 
gram.  The  methodology  concentrates  on  five  improvement  techniques:  (1)  defining  requirements  for  testability, 
(2)  designing  software  for  testability,  (3)  designing  tests  for  most-probable  errors  (see  Software  Standards  in 
May,  July,  and  this  issue),  (4)  designing  tests  before  code  is  designed,  and  (5)  performing  reviews  (inspections 
and  walkthroughs). 

In  recent  years,  all  the  techniques  included  in  the  methodology  have  been  studied  one  by  one.  Based  on
these  studies,  software  researchers  predicted  that  when  all  these  techniques  or  modern  programming  practices
were  used  together,  software  productivity  increases  as  high  as  50  percent  were  possible.  This  case  study  shows 
that  the  synergistic  effect  of  combining  techniques  can  be  even  more  beneficial  than  anticipated. 

[Pout87]  This  paper  presents  two  Ada  testing  tools  that  have  been  developed  in  Nokia  Information  Systems, 
Softplan.  Their  main  properties  are  described.  It  will  be  shown  that  these  tools  have  a  considerable  potential  in 
increasing  the  efficiency  of  testing  (which  is  separate  from  debugging).  These  tools  have  been  written  in  Ada  and 
they  are  fully  independent  of  the  Ada  compilation  system  and  the  operating  system  where  they  are  used.  This
paper  also  reveals  some  of  the  basic  ideas  behind  how  this  kind  of  portability  has  been  reached,  as  well  as  some  experiences
in  this  respect.

[Prat80]  Abstract:  A  vigorous  approach  to  evaluating  computer  models  is  presented.  With  a  concise,  21  ques¬ 
tion  worksheet  as  a  basis,  logical  criteria  are  developed  for  determining  whether  a  model  is:  (1)  safe  for  opera¬ 
tional  use  by  managers,  (2)  in  need  of  further  validation,  or  (3)  of  value  only  in  providing  valuable  lessons  for 
future  modeling  work.  A  numeric  figure  of  merit  reflecting  significant  aspects  of  the  cost/benefit  picture  of  the 
model  is  then  developed  as  a  guide  for  determining  which  models  should  be  further  developed  or  implemented. 

[Prat87]  Abstract:  A  new  software  testing  strategy  is  described.  The  strategy  is  “adaptive”  in  that  previous  test 
paths  (inputs)  are  used  as  a  guide  in  the  selection  of  subsequent  paths  (inputs).  Preliminary  implementations 
have  successfully  exploited  the  method’s  inherent  user-interactive  capability.  The  method  ensures  branch  cover¬ 
age,  requires  only  “order  n”  tests  (n  being  the  number  of  decision  nodes  in  the  program  flowgraph),  and  offers 
considerable  advantages  over  existing  strategies  in  its  computational  requirements. 

[Press83]  Abstract:  Software  metrics  (or  measurements)  which  predict  software  quality  were  extended  from 
previous  research  to  include  two  additional  quality  factors:  interoperability  and  reusability.  Aspects  of  require¬ 
ments,  design,  and  source  language  programs  which  could  affect  these  two  quality  factors  were  identified  and 
metrics  to  measure  them  were  defined.  These  aspects  were  identified  by  theoretical  analysis,  literature  search, 
interviews  with  project  managers  and  software  engineers,  and  personal  experience. 

A  guidebook  for  software  quality  measurement  was  produced  to  assist  in  setting  quality  goals,  applying 
metrics  and  making  quality  assessments. 

[Prob82c]  Abstract:  A  standard  technique  for  monitoring  software  testing  activities  is  to  instrument  the  module 
under  test  with  counters  or  probes  before  testing  begins;  then,  during  testing,  data  generated  by  these  probes  can 
be  used  to  identify  portions  of  as  yet  unexercised  code.  In  this  paper  the  effect  of  the  disciplined  use  of
language  features  for  explicitly  delimiting  control  flow  constructs  is  investigated  with  respect  to  the  correspond¬ 
ing  ease  of  software  instrumentation.  In  particular,  assuming  all  control  constructs  are  explicitly  delimited,  for 
example,  by  END  IF  or  equivalent  statements,  an  easily  programmed  method  is  given  for  inserting  a  minimum 
number  of  probes  for  monitoring  statement  and  branch  execution  counts  without  disrupting  source  code  struc¬ 
ture  or  paragraphing.  The  use  of  these  probes,  called  statement  probes,  is  contrasted  with  the  use  of  standard 
(branch)  probes  for  execution  monitoring.  It  is  observed  that  the  results  apply  to  well-delimited  modules  written 
in  a  wide  variety  of  programming  languages,  in  particular,  Ada. 
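
Illustrative sketch (not taken from [Prob82c]): the general idea of instrumenting a module with counters can be shown with hand-placed probes. The probe labels and the `classify` function are hypothetical, and the placement is not the minimum-probe method described in the paper.

```python
from collections import Counter

# Execution counts keyed by probe label.
probes: Counter = Counter()


def probe(label: str) -> None:
    """Record one execution of the code region identified by label."""
    probes[label] += 1


def classify(x: int) -> str:
    """Module under test, instrumented at entry and on each branch."""
    probe("classify:entry")
    if x < 0:
        probe("classify:then")
        result = "negative"
    else:
        probe("classify:else")
        result = "non-negative"
    return result


ALL_PROBES = {"classify:entry", "classify:then", "classify:else"}

# A test set that never supplies a negative input.
for value in (3, 7, 0):
    classify(value)

# Probes that were never hit identify code not yet exercised.
unexercised = ALL_PROBES - set(probes)
```

After this test run the `then` branch is reported as unexercised, which is the kind of feedback the abstract describes.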

[Prob84]  Abstract:  A  testing-based  approach  for  constructing  and  refining  very  high-level  software  functionality 
representations  such  as  intentions,  natural  language  assertions,  and  formal  specifications  is  presented  and 
applied  to  a  standard  line-editing  problem  as  an  illustration.  The  approach  involves  the  use  of  specification-based 
(black-box)  testcase  generation  strategies,  high-level  specification  formalisms,  redundant  or  parallel  develop¬ 
ment  and  cross  validation,  and  a  logic  programming  support  environment.  Test-case  reference  sets  are  used  as 
software  functionality  representations  for  the  purposes  of  cross  validating  two  distinct  high-level  representations, 
and  identifying  ambiguities  and  omissions  in  those  representations.  In  fact,  we  propose  the  use  of  successive 
refinements  of  such  test  reference  sets  as  the  authoritative  specification  throughout  the  software  development 
process.  Potential  benefits  of  the  approach  include  improvements  in  user/designer  communication  over  all  life
cycle  phases,  and  an  increase  in  the  quality  of  specifications  and  designs. 

[Prot88]  Abstract:  This  study  presents  results  of  a  software  reliability  experiment  that  investigates  the  feasibility 
of  a  new  error  detection  method.  The  method  can  be  used  as  an  acceptance  test  and  is  solely  based  on  empirical 
data  about  the  behavior  of  internal  states  of  a  program.  The  experimental  design  uses  the  existing  environment  of 
multi-version  experiment  previously  conducted  at  the  NASA  Langley  Research  Center,  in  which  the  ‘launch 
interceptor’  problem  is  used  as  a  model  problem.  This  allows  the  controlled  experimental  investigation  of  versions
with  well-known  single  and  multiple  faults,  and  the  availability  of  an  oracle  permits  the  determination  of
the  error  detection  performance  of  the  test.  Fault-interaction  phenomena  are  observed  that  have  an  amplifying 
effect  on  the  number  of  error  occurrences.  Preliminary  results  indicate  that  all  faults  examined  so  far  are 
detected  by  the  acceptance  test.  This  shows  promise  for  further  investigations,  and  for  the  employment  of  this 
test  method  in  other  applications. 

[Purd72]  Abstract:  A  fast  algorithm  is  given  to  produce  a  small  set  of  short  sentences  from  a  context  free  gram¬ 
mar  such  that  each  production  of  the  grammar  is  used  at  least  once.  The  sentences  are  useful  for  testing  parsing 
programs  and  for  debugging  grammars  (finding  errors  in  a  grammar  which  causes  it  to  specify  some  language 
other  than  the  one  intended).  Some  experimental  results  from  using  the  sentences  to  test  some  automatically 
generated  simple  LR(1)  parsers  are  also  given. 
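
Illustrative sketch (not the algorithm of [Purd72]): the goal of using each production at least once can be met naively by generating one sentence per production. The code places each production in a shortest-found context from the start symbol and completes the sentence with shortest expansions. It makes no attempt to keep the set small, and the example grammar is hypothetical.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[List[str]]]  # nonterminal -> list of right-hand sides


def shortest_expansions(g: Grammar) -> Dict[str, List[str]]:
    """Fixpoint: a shortest terminal string derivable from each nonterminal."""
    best: Dict[str, List[str]] = {}
    changed = True
    while changed:
        changed = False
        for nt, prods in g.items():
            for rhs in prods:
                if all(s in best or s not in g for s in rhs):
                    words = [w for s in rhs for w in (best[s] if s in g else [s])]
                    if nt not in best or len(words) < len(best[nt]):
                        best[nt] = words
                        changed = True
    return best


def contexts(g: Grammar, start: str) -> Dict[str, Tuple[List[str], List[str]]]:
    """Breadth-first search for a sentential form (prefix, suffix) around each nonterminal."""
    ctx: Dict[str, Tuple[List[str], List[str]]] = {start: ([], [])}
    queue = [start]
    while queue:
        nt = queue.pop(0)
        pre, suf = ctx[nt]
        for rhs in g[nt]:
            for i, s in enumerate(rhs):
                if s in g and s not in ctx:
                    ctx[s] = (pre + rhs[:i], rhs[i + 1:] + suf)
                    queue.append(s)
    return ctx


def covering_sentences(g: Grammar, start: str) -> List[str]:
    """Return sentences that together use every reachable, productive production."""
    best = shortest_expansions(g)
    ctx = contexts(g, start)
    sentences: List[str] = []
    for nt, prods in g.items():
        if nt not in ctx:
            continue  # nonterminal is unreachable from the start symbol
        pre, suf = ctx[nt]
        for rhs in prods:
            form = pre + rhs + suf
            if any(s in g and s not in best for s in form):
                continue  # production cannot yield a terminal string
            words = [w for s in form for w in (best[s] if s in g else [s])]
            sentences.append(" ".join(words))
    return list(dict.fromkeys(sentences))  # drop duplicates, keep order


grammar: Grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["(", "E", ")"], ["id"]],
}
sentences = covering_sentences(grammar, "E")
```

For the expression grammar above, four productions are covered by three distinct sentences. Feeding such sentences to a parser exercises every production at least once, which is the use the abstract describes.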

[Putn78]  Abstract:  Application  software  development  has  been  an  area  of  organizational  effort  that  has  not
been  amenable  to  the  normal  managerial  and  cost  controls.  Instances  of  actual  costs  of  several  times  the  initial 
budgeted  cost,  and  a  time  to  initial  operational  capability  sometimes  twice  as  long  as  planned  are  more  often  the 
case  than  not. 

A  macromethodology  to  support  management  needs  has  now  been  developed  that  will  produce  accurate 
estimates  of  manpower,  costs,  and  times  to  reach  critical  milestones  of  software  projects.  There  are  four  parameters
in  the  basic  system  and  these  are  in  terms  managers  are  comfortable  working  with:  effort,  development
time,  elapsed  time,  and  a  state-of-technology  parameter. 

The  system  provides  managers  sufficient  information  to  assess  the  financial  risk  and  investment  value  of  a 
new  software  development  project  before  it  is  undertaken  and  provides  techniques  to  update  estimates  from  the 
actual  data  stream  once  the  project  is  underway.  Using  the  technique  developed  in  the  paper,  adequate  analysis 
for  decisions  can  be  made  in  an  hour  or  two  using  only  a  few  quick  reference  tables  and  a  scientific  pocket  calcu¬ 
lator. 

[Putn79]  Overview:  Few  managers  are  able  to  predict  the  time  and  resources  needed  to  develop  large-scale
software  systems.  Progress  is  often  measured  by  the  rate  of  expenditure  of  resources  rather  than  by  some  count 
of  accomplishments.  Unrealistic  estimates  often  result  in  last  minute  efforts  to  get  code  written  quickly,  resulting 
in  cost  overruns  and  poor  quality  software. 

Software  development  can  be  brought  under  control.  It  requires  an  understanding  of  how  application 
software  behaves,  what  factors  management  can  control  and  what  factors  are  limited  by  the  process  itself. 

The  basis  of  effective  management  is  the  fact  that  the  software  development  process  exhibits  a  charac¬ 
teristic  behavior,  which  can  be  exploited,  so  that  the  expensive  results  of  unrealistic  approaches  can  be  avoided. 

[Quir85]  Preface:  Real-time  software  poses  serious  problems.  It  fails  too  often  and  the  failures  can  be  both 
extremely  troublesome  and  sometimes  dangerous. 

In  this  report,  the  techniques  available  for  validation  and  verification  of  real-time  systems  software  are 
reviewed.  Material,  which  is  at  present  scattered  through  conference  proceedings,  research  notes  and  journal 
papers,  is  gathered  together  and  presented  in  the  context  of  practical  usefulness.  More  detailed  references  are 
included  wherever  possible. 

[RADC76a]  Abstract:  A  study  of  software  errors  is  presented.  Techniques  for  categorizing  errors  according  to 
type,  identifying  their  source,  and  detecting  them  are  discussed.  Various  techniques  used  in  analyzing  empirical
error  data  collected  from  four  large  software  systems  are  discussed  and  results  of  analysis  are  presented.  Use  of 
results  to  indicate  improvements  in  the  error  prevention  and  detection  processes  through  use  of  tools  and  tech¬ 
niques  is  also  discussed. 

A  survey  of  software  reliability  models  is  included,  and  recent  work  on  TRW’s  Mathematical  Theory  of 
Software  Reliability  (MTSR)  is  presented. 

Finally,  lessons  learned  in  conjunction  with  collecting  software  data  are  outlined,  with  recommendations 
for  improving  the  data  collection  process. 

[Rabi77]  Abstract:  The  framework  for  research  in  the  theory  of  complexity  of  computations  is  described, 
emphasizing  the  interrelation  between  seemingly  diverse  problems  and  methods.  Illustrative  examples  of  practi¬ 
cal  and  theoretical  significance  are  given.  Directions  for  new  research  are  suggested. 

[Rama75a]  Abstract:  In  the  past  few  years,  research  has  been  actively  carried  out  in  an  attempt  to  improve  the 
quality  and  reliability  of  large-scale  software  systems.  Although  progress  has  been  made  on  the  formal  proof  of 
program  correctness,  proving  large-scale  software  systems  correct  by  formal  proof  is  still  many  years  away. 
Automated  software  tools  have  been  found  to  be  valuable  in  improving  software  reliability  and  attacking  the  high 
cost  of  software  systems.  This  paper  attempts  to  describe  some  main  features  of  automated  software  tools  and 
some  software  evaluation  systems  that  are  currently  available. 

[Rama76]  Abstract:  Software  validation  through  testing  will  continue  to  be  a  very  important  tool  for  ensuring 
correctness  of  large  scale  software  systems.  Automation  of  testing  tools  can  greatly  enhance  their  power  and 
reduce  testing  cost.  In  this  paper,  techniques  for  automated  test  data  generation  are  discussed.  Given  a  program 
graph,  a  set  of  paths  are  identified  to  satisfy  some  given  testing  criteria.  When  a  path  or  program  segment  is 
specified,  symbolic  execution  is  used  for  generating  input  constraints  which  define  a  set  of  inputs  for  executing 
this  path  or  segment.  Problems  encountered  in  symbolic  execution  are  discussed.  A  new  approach  for  resolving 
array  reference  ambiguities  and  a  procedure  for  generating  test  inputs  satisfying  input  constraints  are  proposed.
References  to  arrays  are  recorded  in  a  table  during  symbolic  execution  and  ambiguities  are  resolved  when  test 
data  are  generated  to  evaluate  the  subscript  expressions.  The  implementation  of  a  test  data  generator  for  Fortran 
programs  incorporating  these  techniques  is  also  described. 

[Rama81]  Abstract:  This  paper  discusses  the  necessity  of  a  good  methodology  for  the  development  of  reliable
software,  especially  with  respect  to  the  final  software  validation  and  testing  activities.  A  formal  specification 
development  and  validation  methodology  is  proposed.  This  methodology  has  been  applied  to  the  development 
and  validation  of  a  pilot  software,  incorporating  typical  features  of  critical  software  for  nuclear  power  plant 
safety  protection.  The  main  features  of  the  approach  include  the  use  of  a  formal  specification  language  and  the 
independent  development  of  two  sets  of  specifications.  Analyses  on  the  specification  consists  of  three  parts:  vali¬ 
dation  against  the  functional  requirements,  consistency  and  integrity  of  the  specifications,  and  dual  specification 
comparison  based  on  a  high-level  symbolic  execution  technique.  Dual  design,  implementation,  and  testing  activi¬ 
ties  are  developed  to  support  the  methodology.  These  include  the  symbolic  executor  and  test  data  generator/dual 
program  monitor  system.  The  experiences  of  applying  the  methodology  to  the  pilot  software  are  discussed,  and 
the  impact  on  the  quality  of  the  software  is  assessed. 

[Rama82]  Abstract:  It  is  essential  to  assess  the  reliability  of  digital  computer  systems  used  for  critical  real-time 
control  applications  (e.g.,  nuclear  power  safety  control  systems).  This  involves  the  assessment  of  the  design 
correctness  of  the  combined  hardware/software  system  as  well  as  the  reliability  of  the  hardware.  In  this  paper  we
survey  methods  of  determining  the  design  correctness  of  systems  as  applied  to  computer  programs. 

Automated  program  proving  techniques  are  still  not  practical  for  realistic  programs.  Manual  proofs  are 
lengthy,  tedious,  and  error-prone.  Software  reliability  provides  a  measure  of  confidence  in  the  operational 
correctness  of  the  software.  Since  the  early  1970’s  several  software  reliability  models  have  been  proposed.  We 
classify  and  discuss  these  models  using  the  concepts  of  residual  error  size  and  the  testing  process  used.  We  also 
discuss  methods  of  estimating  the  correctness  of  the  program  and  adequacy  of  the  set  of  tests  used.

These  methods  are  directly  applicable  to  assessing  the  design  correctness  of  the  total  integrated 
hardware/software  system  which  ultimately  could  include  large  complex  distributed  processing  systems. 

[Rand75]  Abstract:  This  paper  presents  and  discusses  the  rationale  behind  a  method  for  structuring  complex 
computing  systems  by  the  use  of  what  we  term  “recovery  blocks,”  “conversations,”  and  “fault  tolerant  inter¬ 
faces.”  The  aim  is  to  facilitate  the  provision  of  dependable  error  detection  and  recovery  facilities  which  can  cope 
with  errors  caused  by  residual  design  inadequacies,  particularly  in  the  system  software,  rather  than  merely  the 
occasional  malfunctioning  of  hardware  components. 
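
The recovery block idea named in this abstract can be illustrated with a minimal sketch. It assumes the usual reading of the term: a primary routine and one or more alternates run against a saved copy of the state, and a result is kept only if it passes an acceptance test. All names below (`recovery_block`, `faulty_sort`, `simple_sort`) are hypothetical and are not taken from the paper.

```python
import copy

def recovery_block(state, alternates, acceptance_test):
    """Try each alternate in order on a fresh copy of the state.

    Returns the first resulting state that passes the acceptance test and
    raises RuntimeError if every alternate fails.
    """
    for alternate in alternates:
        checkpoint = copy.deepcopy(state)  # recovery point: work on a copy
        try:
            alternate(checkpoint)
            if acceptance_test(checkpoint):
                return checkpoint
        except Exception:
            pass  # a raised exception counts as a detected error
        # falling through discards the copy, which acts as the rollback
    raise RuntimeError("all alternates failed the acceptance test")

def faulty_sort(s):
    s["data"] = s["data"][:-1]  # hypothetical design fault: drops an element

def simple_sort(s):
    s["data"] = sorted(s["data"])

def is_sorted_permutation(original):
    def test(s):
        d = s["data"]
        in_order = all(a <= b for a, b in zip(d, d[1:]))
        return in_order and sorted(original) == sorted(d)
    return test

state = {"data": [3, 1, 2]}
result = recovery_block(state, [faulty_sort, simple_sort],
                        is_sorted_permutation(state["data"]))
```

Here the primary fails the acceptance test, its copy is discarded, and the alternate runs from the unchanged original state.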

[Rapp80]  Abstract:  This  paper  examines  a  family  of  program  test  data  selection  criteria  derived  from  data  flow 
analysis  techniques  similar  to  those  used  in  compiler  optimization.  It  is  argued  that  currently  used  path  selection 
criteria  which  examine  only  the  control  flow  of  a  program  are  inadequate.  Our  procedure  associates  with  each 
point  in  a  program  at  which  a  variable  is  defined,  those  points  at  which  the  value  is  used.  Several  related  path  cri¬ 
teria,  which  differ  in  the  number  of  these  associations  needed  to  adequately  test  the  program,  are  defined  and 
compared. 
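
The definition/use associations this abstract describes can be sketched with a standard reaching-definitions pass over a small control flow graph. This is an illustration under assumptions, not the paper's procedure: the graph encoding and the function name `def_use_pairs` are hypothetical.

```python
def def_use_pairs(nodes, edges):
    """nodes: {node_id: (defined_vars, used_vars)}; edges: [(src, dst)].

    Returns the set of (variable, def_node, use_node) associations for which
    the definition reaches the use along some path with no redefinition.
    """
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)
    reach_in = {n: set() for n in nodes}
    reach_out = {n: set() for n in nodes}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for n, (defs, _uses) in nodes.items():
            new_in = set().union(*(reach_out[p] for p in preds[n]))
            new_out = {(v, d) for v, d in new_in if v not in defs}
            new_out |= {(v, n) for v in defs}
            if new_in != reach_in[n] or new_out != reach_out[n]:
                reach_in[n], reach_out[n] = new_in, new_out
                changed = True
    return {(v, d, n)
            for n, (_defs, uses) in nodes.items()
            for v, d in reach_in[n] if v in uses}

# 1: x = input()   2: if x > 0   3: y = x   4: y = -x   5: print(y)
nodes = {
    1: ({"x"}, set()),
    2: (set(), {"x"}),
    3: ({"y"}, {"x"}),
    4: ({"y"}, {"x"}),
    5: (set(), {"y"}),
}
edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)]
pairs = def_use_pairs(nodes, edges)
```

A data flow criterion then asks the test set to exercise some or all of these associations; a control-flow-only criterion can be satisfied without covering both definitions of `y` reaching node 5 deliberately.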

[Redw83]  Abstract:  A  systematic  approach  to  test  data  design  is  presented  based  on  both  practical  translation 
of  theory  and  organization  of  professional  [XXX].  The  approach  is  organized  around  five  domains  and  achieving 
coverage  (exercise)  of  them  by  the  test  data.  The  domains  are  processing  functions,  input,  output,  interaction 
among  functions,  and  the  code  itself.  Checklists  are  used  to  generate  data  for  processing  functions.  Separate 
checklists  have  been  constructed  for  eight  common  business  data  processing  functions  such  as  editing,  updating, 
sorting,  and  reporting.  Checklists  or  specific  concrete  directions  also  exist  for  input,  output,  interaction,  and 
code  coverage.  Two  global  heuristics  concerning  all  test  data  are  also  used.  A  limited  discussion  on  documenting 
test  input  data,  expected  results,  and  actual  results  is  included. 

Use,  applicability,  and  possible  expansions  are  covered  briefly.  Introduction  of  the  method  has  similar  dif¬ 
ficulties  to  those  experienced  when  introducing  any  disciplined  technique  into  an  area  where  discipline  was  previ¬ 
ously  lacking.  The  approach  is  felt  to  be  easily  modifiable  and  usable  for  types  of  systems  other  than  the  tradi¬ 
tional  business  data  processing  ones  for  which  it  was  originally  developed. 

[Reif75]  Abstract:  Recent  investigations  on  the  use  of  automation  to  realize  the  twin  objectives  of  cost  reduc¬ 
tion  and  reliability  improvement  for  computer  programs  developed  for  the  U.S.  Air  Force  are  reported.  The 
concepts  of  reliability  and  automation  as  they  pertain  to  software  are  explained.  Then,  over  twenty  automated 
tools  and  techniques  (aids)  identified  in  this  investigation  are  described  and  categorized.  Based  on  the  informa¬ 
tion  reviewed,  an  assessment  of  the  state  of  the  technology  is  made.  Finally,  specific  recommendations  which  try 
to  give  direction  to  future  efforts  are  offered. 

[Reif79a]  Abstract:  This  concept  paper  discusses  the  possible  use  of  failure  modes  and  effects  analysis  (FMEA)
as  a  means  to  produce  more  reliable  software.  FMEA  is  a  fault  avoidance  technique  whose  objective  is  to  iden¬ 
tify  hazards  in  requirements  that  have  the  potential  to  either  endanger  mission  success  or  significantly  impact 
life-cycle  costs.  FMEA  techniques  can  be  profitably  applied  during  the  analysis  stage  to  identify  potential 
hazards  in  requirements  and  design.  As  hazards  are  identified,  software  defenses  can  be  developed  using  fault 
tolerant  or  self-checking  techniques  to  reduce  the  probability  of  their  occurrence  once  the  program  is  imple¬ 
mented.  Critical  design  features  can  also  be  demonstrated  a  priori  analytically  using  proof  of  correctness  tech¬ 
niques  prior  to  their  implementation  if  warranted  by  cost  and  criticality. 

[Reif79b]  Abstract:  Software  tools  can  serve  as  powerful  aids  in  the  design,  development,  test,  and  maintenance 
of  computer  software.  So,  in  light  of  the  recent  growth  in  cost  of  software  relative  to  total  system  cost,  it  should 
come  as  no  surprise  that  the  subject  of  software  tools  has  sparked  a  good  deal  of  interest  throughout  the  com¬ 
puter  industry.  In  response  to  that  interest,  this  paper  provides  a  comprehensive  listing  of  the  software  tools  and 
techniques  currently  available. 

We  first  describe  the  typical  software  life  cycle  and  its  three  major  stages:  (1)  conceptual  and  requirements;
(2)  development;  and  (3)  operations  and  maintenance.  Next  we  describe  the  six  categories  of  software
tools  (simulation,  development,  test  and  evaluation,  operations  and  maintenance,  performance  measurement, 
and  programming  support).  Table  1  relates  the  life  cycle  areas  to  the  various  categories  of  software  tools. 

[Reyn87]  Abbreviated  Introduction:  The  Partial  Metrics  System  [to  support  the  metrics-driven]  implementation 
of  individual  modules  in  a  large-scale  programming  project  design  is  explained,  with  an  emphasis  on  the  refine¬ 
ment  process.  A  model,  with  its  three  phases,  shows  that  the  pseudocode  refinement  process  can  be  monitored 
in  partial  metric  terms. 

[Reyn89]  Abstract:  This  paper  describes  a  software  tool,  the  partial  metrics  system  (PMS),  that  supports  the 
metrics-driven  design  of  pseudocode  program  modules.  Although  this  is  a  generic  approach  that  is  language 
independent,  we  illustrate  its  application  using  Ada  as  the  target  language.  Each  new  refinement  of  a  pseu¬ 
docode  program  is  assessed  in  terms  of  a  set  of  partial  metrics.  These  metrics  are  extensions  of  Halstead’s 
Software  Science,  McCabe’s  Cyclomatic  Complexity,  and  others. 

It  is  then  demonstrated  how  these  metrics  can  drive  the  design  process  for  an  individual  module.  Heuris¬ 
tics  are  suggested  that  can  allow  the  programmer  to  make  use  of  these  metrics  in  order  to  produce  improved 
designs. 
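
As a rough illustration of tracking a metric across refinements, the sketch below computes McCabe's cyclomatic complexity, V(G) = E - N + 2P, for three hypothetical refinements of one module. The partial metrics of the paper are extensions of such measures; this example uses only the standard formula, and the graphs are invented for illustration.

```python
def cyclomatic_complexity(num_nodes, edges, components=1):
    """McCabe's V(G) = E - N + 2P for a control flow graph."""
    return len(edges) - num_nodes + 2 * components

# Hypothetical successive refinements of one module's control flow graph,
# each given as (number of nodes, list of edges).
refinement_1 = (2, [(1, 2)])                                  # straight line
refinement_2 = (4, [(1, 2), (1, 3), (2, 4), (3, 4)])          # one if/else
refinement_3 = (5, [(1, 2), (1, 3), (2, 4), (3, 4),
                    (4, 1), (4, 5)])                          # plus a loop

history = [cyclomatic_complexity(n, e)
           for n, e in (refinement_1, refinement_2, refinement_3)]
```

A designer could watch such a history and reconsider a refinement step that raises the value more than expected.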


[Rich81a]  Abstract:  A  major  drawback  of  most  program  testing  methods  is  that  they  ignore  program  specifica¬ 
tions,  and  instead  base  their  analysis  solely  on  the  information  provided  in  the  implementation.  This  paper 
describes  the  partition  analysis  method,  which  assists  in  program  testing  and  verification  by  evaluating  informa¬ 
tion  from  both  a  specification  and  an  implementation.  This  method  employs  symbolic  evaluation  techniques  to 
partition  the  set  of  input  data  into  procedure  subdomains  so  that  the  elements  of  each  subdomain  are  treated  uni¬ 
formly  by  the  specification  and  processed  uniformly  by  the  implementation.  The  partition  divides  the  procedure 
domain  into  more  manageable  units.  Information  related  to  each  subdomain  is  used  to  guide  in  the  selection  of 
test  data  and  to  verify  consistency  between  the  specification  and  the  implementation.  Moreover,  the  test  data 
selection  process,  called  partition  analysis  testing,  and  the  verification  process,  called  partition  analysis  verifica¬ 
tion,  are  used  to  enhance  each  other,  and  thus  increase  program  reliability. 

[Rich82]  Abstract:  The  partition  analysis  method  compares  a  procedure’s  implementation  to  its  specification.  In 
addition  to  verifying  consistency  between  the  two,  this  comparison  is  used  to  derive  test  data.  Unlike  most  test 
data  selection  strategies,  which  consider  only  the  implementation,  partition  analysis  selects  test  data  that  charac¬ 
terize  the  procedure  in  terms  of  its  intended  behavior  as  well  as  the  structure  of  its  implementation.  To  accom¬ 
plish  this,  partition  analysis  divides  or  “partitions”  the  procedure’s  domain  into  subdomains  in  which  all  ele¬ 
ments  of  each  subdomain  are  treated  uniformly  by  the  specification  and  processed  uniformly  by  the  implementa¬ 
tion.  Initial  experimentation  has  shown  that  through  the  integration  of  testing  and  verification,  as  well  as  through 
the  use  of  information  derived  from  both  the  implementation  and  the  specification,  the  partition  analysis  method 
is  effective  for  determining  program  reliability.  This  paper  provides  an  overview  of  the  partition  analysis  method 
and  reports  the  results  obtained  from  preliminary  evaluation  of  its  effectiveness. 

[Rich85a]  Abbreviated  Introduction:  Several  of  the  validation  tools  being  developed  employ  a  method  called 
symbolic  evaluation,  which  creates  a  symbolic  representation  of  the  program.  This  chapter  describes  symbolic 
evaluation  and  surveys  some  of  the  testing  applications  of  this  method. 

Symbolic  evaluation  monitors  the  manipulations  performed  on  the  input  data.  Computations  and  their 
applicable  domain  are  represented  algebraically  over  the  input  domain,  thereby  describing  the  relationship 
between  the  input  data  and  the  resulting  values.  Normal  execution  computes  numerical  values  but  loses  informa¬ 
tion  about  the  way  in  which  these  numerical  values  were  derived,  whereas  symbolic  evaluation  preserves  this 
information.  When  further  analyzed,  this  information  provides  the  basis  for  several  testing  techniques. 

For  the  most  part,  current  testing  research  is  directed  at  either  the  problem  of  determining  the  paths  (the 
particular  sequences  of  statements)  that  must  be  tested  or  the  problem  of  selecting  revealing  test  data  for  the 
selected  paths.  For  the  path  selection  problem,  techniques  such  as  program  coverage,  data  flow  testing,  and
perturbation  testing  have  been  proposed.  For  the  test  data  selection  problem,  a  number  of  informal  guidelines  have
been  put  forth.  Recently  there  has  been  considerable  work  on  developing  more  systematic  test  data  selection 
techniques  that  can  either  eliminate  certain  classes  of  errors  or  provide  a  quantifiable  error  bound.  Many  of  the 
current  path  selection  and  test  data  selection  techniques  base  their  analyses  on  the  information  provided  by  sym¬ 
bolic  evaluation. 

The  above  testing  techniques  are  referred  to  as  structural  techniques,  since  they  base  their  analysis  solely 
on  the  information  provided  by  a  given  implementation.  There  are  two  drawbacks  to  such  an  approach.  First,  it 
ignores  the  information  that  may  be  available  from  a  specification.  Second,  it  delays  testing  until  the  implementa¬ 
tion  is  complete,  thereby  not  detecting  errors  in  the  most  timely  and  cost-effective  manner.  Research  efforts  that 
use  symbolic  evaluation  to  assist  in  solving  both  these  problems  are  currently  underway.  Specification-guided 
program  testing  techniques  use  the  information  provided  by  symbolic  evaluation  of  a  specification  to  guide  in  the 
testing  of  its  implementation,  while  specification  testing  techniques  employ  symbolic  evaluation  to  actually  test  a 
specification. 

The  next  section  of  this  chapter  provides  a  brief  overview  of  symbolic  evaluation,  with  an  example  to 
demonstrate  the  method.  The  third  section  describes  a  number  of  ways  in  which  symbolic  evaluation  of  a  pro¬ 
gram  aids  the  path  selection  and  test  data  selection  aspects  of  testing.  The  fourth  section  describes  the  use  of 
symbolic  evaluation  for  specification-guided  program  testing  and  specification  testing. 
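
A small sketch may help make the idea of symbolic evaluation concrete. It is not the example from the chapter: the tiny statement encoding and the names `subst`, `show`, and `symbolic_paths` are assumptions made for illustration. Each path through the program yields a path condition (the path's domain) and a symbolic expression for the output, both written over the input symbols.

```python
def subst(expr, env):
    """Replace variable names in an expression tree by their symbolic values."""
    if isinstance(expr, str):
        return env.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, subst(left, env), subst(right, env))
    return expr  # numeric constant

def show(expr):
    """Render an expression tree as a fully parenthesized string."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({show(left)} {op} {show(right)})"
    return str(expr)

def symbolic_paths(block, env, cond=()):
    """Yield (path_condition, final_env) for every path through block."""
    if not block:
        yield cond, env
        return
    stmt, rest = block[0], block[1:]
    if stmt[0] == "assign":
        _, var, expr = stmt
        yield from symbolic_paths(rest, {**env, var: subst(expr, env)}, cond)
    elif stmt[0] == "if":
        _, test, then_block, else_block = stmt
        t = show(subst(test, env))
        yield from symbolic_paths(then_block + rest, env, cond + (t,))
        yield from symbolic_paths(else_block + rest, env, cond + (f"not {t}",))

# y := a + 1; if y > b then z := y - b else z := b - y
program = [
    ("assign", "y", ("+", "a", 1)),
    ("if", (">", "y", "b"),
        [("assign", "z", ("-", "y", "b"))],
        [("assign", "z", ("-", "b", "y"))]),
]
paths = [(cond, show(env["z"]))
         for cond, env in symbolic_paths(program, {"a": "a", "b": "b"})]
```

Normal execution with a = 2, b = 1 would report only z = 2; the symbolic result records that z = (a + 1) - b whenever (a + 1) > b.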

[Rich85b]  Abstract:  The  partition  analysis  method  compares  a  procedure’s  implementation  to  its  specification, 
both  to  verify  consistency  between  the  two  and  to  derive  test  data.  Unlike  most  verification  methods,  partition 
analysis  is  applicable  to  a  number  of  different  types  of  specification  languages,  including  both  procedural  and 
nonprocedural  languages.  It  is  thus  applicable  to  high-level  descriptions  as  well  as  to  low-level  designs.  Partition 
analysis  also  improves  upon  existing  testing  criteria.  These  criteria  usually  consider  only  the  implementation,  but 
partition  analysis  selects  test  data  that  exercise  both  a  procedure’s  intended  behavior  (as  described  in  the  specifi¬ 
cations)  and  the  structure  of  its  implementation.  To  accomplish  these  goals,  partition  analysis  divides  or  parti¬ 
tions  a  procedure’s  domain  into  subdomains  in  which  all  elements  of  each  subdomain  are  treated  uniformly  by 
the  specification  and  processed  uniformly  by  the  implementation.  This  partition  divides  the  procedure  domain 
into  more  manageable  units.  Information  related  to  each  subdomain  is  used  to  guide  in  the  selection  of  test  data 
and  to  verify  consistency  between  the  specification  and  the  implementation.  Moreover,  the  testing  and  verification
processes  are  designed  to  enhance  each  other.  Initial  experimentation  has  shown  that  through  the  integration
of  testing  and  verification,  as  well  as  through  the  use  of  information  derived  from  both  the  implementation
and  the  specification,  the  partition  analysis  method  is  effective  for  evaluating  program  reliability.  This  paper 
describes  the  partition  analysis  method  and  reports  the  results  obtained  from  an  evaluation  of  its  effectiveness. 

[Rich87a]  Abstract:  RELAY,  a  model  for  error  detection,  defines  revealing  conditions  that  guarantee  that  a  fault 
originates  an  error  during  execution  and  that  the  error  transfers  through  computations  and  data  flow  until  it  is 
revealed.  This  model  of  error  detection  provides  a  fault-based  criterion  for  test  data  selection.  The  model  is 
applied  by  choosing  a  fault  classification,  instantiating  the  conditions  for  the  classes  of  faults,  and  applying  them 
to  the  program  being  tested.  Such  an  application  guarantees  the  detection  of  errors  caused  by  any  fault  of  the 
chosen  classes.  As  a  formal  model  of  error  detection,  RELAY  provides  the  basis  for  an  automated  testing  tool. 
This  paper  presents  the  concepts  behind  RELAY,  describes  why  it  is  better  than  other  fault-based  testing  cri¬ 
teria,  and  discusses  how  RELAY  could  be  used  as  the  foundation  for  a  testing  system. 

[Rich88a]  Abbreviated  Introduction:  This  paper  reports  on  a  new  model  of  error  detection  called  RELAY,
which  provides  a  fault-based  criterion  for  test  data  selection.  The  RELAY  model  builds  upon  the  testing  theory
introduced  by  Morell,  where  an  error  is  “created”  when  an  incorrect  state  is  introduced  at  some  fault  location,  and
it  is  “propagated”  if  it  persists  to  the  output.  We  refine  this  theory  by  more  precisely  defining  the  notion  of  when 
an  error  is  introduced  and  by  differentiating  between  the  persistence  of  an  error  through  computations  and  its 
persistence  through  data  flow  operations.  We  introduce  similar  concepts,  origination  and  transfer,  as  the  first 
erroneous  evaluation  and  the  persistence  of  that  erroneous  evaluation,  respectively. 

[Ridd78]  Abstract:  A  modeling  scheme  is  presented  which  provides  a  medium  for  the  rigorous,  formal,  and
abstract  specification  of  large-scale  software  system  components.  The  scheme  allows  the  description  of  com¬ 
ponent  behavior  without  revealing  or  requiring  the  description  of  a  component’s  internal  operation.  Both  collec¬ 
tions  of  sequential  processes  and  the  data  objects  which  they  share  may  be  described.  The  scheme  is  of  particu¬ 
lar  value  during  the  early  stages  of  software  system  design,  when  the  system’s  modules  are  being  delineated  and 
their  interactions  designed,  and  when  rigorous,  well-defined  specification  of  undesigned  components  allows  for¬ 
mal  and  informal  arguments  concerning  the  design’s  correctness  to  be  formulated. 

[Roac80]  Abstract:  Software  cost  estimation  techniques  reported  in  recent  literature  are  compared.  Six  cost
estimation  techniques  are  described  in  a  common  notation.  An  example  is  worked  through  all  the  techniques  to 
illustrate  similarities  and  differences. 

[Roby85]  Abbreviated  Foreword:  These  Proceedings  of  the  First  Workshop  on  Formal  Specification  and  Verification
of  Ada,  held  at  the  Institute  for  Defense  Analyses,  are  composed  in  part  of  papers  and  slides  supplied  by
the  speakers,  and  in  part  of  summaries  of  the  talks  and  discussions  edited  from  recordings  made  of  the 
Workshop. 

The  purpose  of  this  initial  two-and-a-half  day  Workshop  was  to  identify  current  issues  in  Ada  verification 
and  to  decide  what  could  be  done  to  improve  current  understanding  and  practice  of  Ada  software  verification. 
Since  verification  impacts  not  only  coding  activities  but  all  development  activities,  it  is  desirable  that  many 
groups  be  kept  informed  about  the  progress  of  these  Workshops. 

[Roe87]  Abstract:  Several  inequalities  are  derived  for  use  in  certifying  function  subroutines  by  means  of  black 
box  testing.  It  is  assumed  that  a  function  is  approximated  by  means  of  a  polynomial  of  limited  degree  on  a  closed 
interval.  These  inequalities  give  upper  bounds  on  the  error  measured  over  a  finite  sample  and  known  properties 
of  the  function. 

[RombM]  Abstract:  This  paper  describes  results  of  a  study  to  develop  maintenance  metrics  based  on  structural 
software  design  characteristics.  The  intent  of  the  study  was  to  define  a  characteristic  metric  set,  suited  to  explain 
and  predict  software  maintenance  behavior.  The  maintenance  aspects  investigated  in  this  study  are  stability  and 
modifiability.  While  stability  addresses  the  average  number  of  modules  affected  per  change  cause,  modifiability 
characterizes  the  ease  with  which  changes  can  be  made  within  each  of  these  modules.  Additional  interest  is  dedi¬ 
cated  to  the  difference  between  characteristic  design  and  implementation  metric  sets,  and  to  the  difference 
between  change  behavior  during  development  and  maintenance.  This  study  examines  the  development  of  six 
software  systems  and  controlled  maintenance  experiments  using  these  systems. 

[Romb87a]  Abstract:  This  paper  describes  a  study  on  the  impact  of  software  structure  on  maintainability  aspects 
such  as  comprehensibility,  locality,  modifiability,  and  reusability  in  a  distributed  system  environment.  The  study 
was  part  of  a  project  at  the  University  of  Kaiserslautern,  West  Germany,  to  design  and  implement  LADY,  a 
LAnguage  for  Distributed  sYstems.  The  study  addressed  the  impact  of  software  structure  from  two  perspectives. 
The  language  designer’s  perspective  was  to  evaluate  the  general  impact  of  the  set  of  structural  concepts  chosen 
for  LADY  on  the  maintainability  of  software  systems  implemented  in  LADY.  The  language  user’s  perspective 
was  to  derive  structural  criteria  (metrics),  measurable  from  LADY  systems,  that  allow  the  explanation  or  predic¬ 
tion  of  the  software  maintenance  behavior.  A  controlled  maintenance  experiment  was  conducted  involving 
twelve  medium-size  distributed  software  systems;  six  of  these  systems  were  implemented  in  LADY,  the  other  six 
systems  in  an  extended  version  of  sequential  Pascal.  The  benefits  of  the  structural  LADY  concepts  were  judged 
based  on  a  comparison  of  the  average  maintenance  behavior  of  the  LADY  systems  and  the  Pascal  systems;  the 
maintenance  metrics  were  derived  by  analyzing  the  interdependence  between  structure  and  maintenance 
behavior  of  each  individual  LADY  system. 

[Rose84]  Abstract:  This  paper  describes  a  methodology  for  the  design  of  a  class  of  Ada  software  tools  which 
perform  source-to-source  transformation  of  Ada  programs.  The  tools  perform  the  transformations  on  the
DIANA  representation  of  an  input  source  program  using  a  package  of  templates  which  are  the  DIANA
representation  of  source  program  textual  insertions. 

Following  a  brief  overview  of  DIANA,  the  environment  required  by  these  tools  is  described;  a  typical 
environment  consists  of  an  implementation  of  DIANA  and  a  set  of  utility  programs.  Next,  the  paper  describes  a 
“skeleton”  program  which  is  used  to  implement  a  tool;  the  tool  skeleton  is  a  recursive  DIANA  tree  traversal 
program  which  is  expanded  incrementally  with  code  to  perform  a  set  of  specific  transformations.  The  next  por¬ 
tion  of  the  paper  gives  a  detailed  description  of  the  design  methodology;  the  methodology  provides  for  mapping  a 
source-level  specification  of  a  transformation  tool  to  a  DIANA-level  specification,  which  serves  as  an  implementation
guide  for  the  tool.  Finally,  a  description  is  given  of  an  application  of  the  design  methodology,  a  preprocessor
for  the  task  monitoring  system  described  by  Helmbold  and  Luckham.

To  conclude  the  paper,  a  summary  of  the  advantages  and  suggested  applications  of  the  design  methodology 
is  presented.  The  major  advantage  of  the  methodology  is  that  it  allows  the  transformations  performed  by  a  tool 
to  be  implemented  and  tested  incrementally,  making  debugging  less  complex  and  implementation  more  efficient. 

[Rose85a]  Abbreviated  Introduction:  This  article  describes  a  methodology  for  the  design  of  Ada  transformation 
tools  using  the  DIANA  representation  of  Ada  source  program  input.  The  methodology  was  tested  on  the  imple¬ 
mentation  of  the  task  monitor  preprocessor  of  Helmbold  and  Luckham  and  proved  quite  an  effective  way  to 
implement  an  Ada  tool.  A  tool  designed  according  to  the  methodology  requires  an  environment  of  support  programs
and  packages  for  maintaining  DIANA  trees.  Each  tool  is  an  expanded  version  of  a  very  simple  program
called  a  tool  skeleton,  which  is  nothing  more  than  a  case  statement  that  recursively  traverses  a  DIANA  tree. 

[Rose85b]  Abstract:  Computer  systems  have  become  an  integral  part  of  most  organizations.  The  need  to  pro¬ 
vide  continuous,  correct  service  is  becoming  more  critical.  However,  decentralization  of  computing,  inexperi¬ 
enced  users,  and  larger  more  complex  systems  make  for  operational  environments  that  make  it  difficult  to  pro¬ 
vide  continuous,  correct  service.  This  document  is  intended  for  the  computer  system  manager  (or  user)  responsi¬ 
ble  for  the  specification,  measurement,  evaluation,  selection  or  management  of  a  computer  system. 

This  report  addresses  the  concepts  and  concerns  associated  with  computer  system  reliability.  Its  main  pur¬ 
pose  is  to  assist  system  managers  in  acquiring  a  basic  understanding  of  computer  system  reliability  and  to  suggest 
actions  and  procedures  which  can  help  them  establish  and  maintain  a  reliability  program.  The  report  presents 
discussions  on  quantifying  reliability  and  assessing  the  quality  of  the  computer  system.  Design  and  implementa¬ 
tion  techniques  that  may  be  used  to  improve  the  reliability  of  the  system  are  also  discussed.  Emphasis  is  placed 
on  understanding  the  need  for  reliability  and  the  elements  and  activities  that  are  involved  in  implementing  a  relia¬ 
bility  program. 

[Ross88]  Abstract:  This  report  discusses  the  role  of  Management  Indicators  in  validating  the  predictive  capabil¬ 
ity  of  the  bottom-up  evaluation  process,  which  is  defined  by  the  Procedural  Approach  to  the  Evaluation  of 
Software  Development  Methodologies.  The  bottom-up  evaluation  process  provides  a  framework  for  determining 
the  extent  to  which  software  engineering  objectives,  e.g.,  reliability  and  maintainability,  are  present  in  a  software 
product  from  a  design  perspective  of  the  code  and  supporting  documentation.  The  bottom-up  evaluation  process 
is  observed  to  be  a  predictor  of  the  extent  to  which  the  objectives  are  realized  in  the  post-developed  product. 

Employment  of  the  bottom-up  evaluation  process  to  determine  the  extent  to  which  the  objectives  are 
present  in  the  product  is  accomplished  by  the  utilization  of  Design  Indicators.  Management  Indicators  are  pro¬ 
posed  as  a  counterpart  to  Design  Indicators  and  enable  one  to  measure  the  extent  to  which  the  objectives  are 
realized  in  a  developed  product.  While  Design  Indicators  focus  on  design  structure  characteristics  of  the  pro¬ 
duct,  Management  Indicators  focus  on  the  acquisitional,  behavioral,  and  maintenance  characteristics.  To 
accomplish  the  validation  of  the  predictive  capability,  the  correlation  between  the  values  obtained  by  utilizing 
Design  Indicators  and  those  obtained  by  utilizing  Management  Indicators  must  be  investigated.  The  author  has 
chosen  to  study  and  present  the  software  engineering  objectives  of  reliability  and  maintainability  as  they  relate
to  a  future  validation  effort. 

[Rowl81a]  Abstract:  The  element  z  is  called  a  transcendental  for  the  class  F  if  functions  in  F  can  be  uniquely 
identified  by  their  values  at  z.  Conditions  for  the  existence  of  transcendentals  are  discussed  for  certain  classes  of
polynomials,  and  rational  functions.  Of  particular  interest  are  those  transcendentals  having  an  exact  representa¬ 
tion  in  computer  arithmetic.  Algorithms  are  presented  for  reconstruction  of  the  coefficients  of  a  polynomial 
from  its  value  at  a  transcendental.  The  theory  is  illustrated  by  application  to  polynomials,  quadratic  forms,  and 
quadrature  formulas. 
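
One elementary instance of the definition above: for polynomials whose coefficients are integers in the range 0 to z - 1, the integer z identifies the polynomial uniquely, because the coefficients are the base-z digits of the value. The sketch below shows that special case only; it is an assumption chosen for illustration and not necessarily one of the algorithms presented in the paper.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients (Horner)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def coefficients_from_value(value, z):
    """Recover coefficients c0, c1, ... of a polynomial with integer
    coefficients in [0, z) from its value at the integer z."""
    coeffs = []
    while value > 0:
        value, c = divmod(value, z)  # peel off one base-z digit
        coeffs.append(c)
    return coeffs

p = [7, 0, 3, 5]   # 7 + 3x^2 + 5x^3
z = 10             # every coefficient of p is below z
value = evaluate(p, z)
recovered = coefficients_from_value(value, z)
```

A single black-box evaluation at z therefore suffices to distinguish any two polynomials in this restricted class.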

[Rowl88]  Abstract:  This  paper  describes  techniques  for  the  automatic  generation  of  large  artificial  software  sys¬ 
tems  which  can  be  used  for  laboratory  studies  of  testing  and  integration  strategies,  reliability  models,  and  so 
forth.  A  prototype  generator  is  described  which  produces  code  for  such  systems  by  constructing  a  large  number 
of  nearly  identical  modules.  This  generator  has  been  used  to  construct  a  family  of  systems  which  in  theory  can  be 
made  arbitrarily  large.  Several  experiments  were  conducted  to  explore  the  sensitivity  of  the  Jelinski-Moranda 
model  to  violations  of  the  assumption  that  all  defects  have  equal  probability  of  being  discovered. 
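
For orientation, the Jelinski-Moranda model assumes N initial defects that each contribute the same amount phi to the failure rate, so the rate before the i-th failure is phi(N - i + 1). The sketch below computes the expected inter-failure times under that assumption and draws one simulated failure history in which the per-defect rates may be unequal. The function names and the rates used are hypothetical; the experiments reported in the paper are not reproduced here.

```python
import random

def jm_hazard(n_faults, phi, i):
    """Failure rate before the i-th failure (i = 1..n_faults)."""
    return phi * (n_faults - i + 1)

def jm_expected_gaps(n_faults, phi):
    """Expected time between successive failures under the model."""
    return [1.0 / jm_hazard(n_faults, phi, i) for i in range(1, n_faults + 1)]

def simulate_gaps(rates, seed=0):
    """Draw one exponential discovery time per defect and return the
    inter-failure gaps. Equal rates match the model's assumption;
    unequal rates violate it."""
    rng = random.Random(seed)
    times = sorted(rng.expovariate(r) for r in rates)
    return [t - prev for prev, t in zip([0.0] + times[:-1], times)]

expected = jm_expected_gaps(4, 0.5)
equal_case = simulate_gaps([0.5, 0.5, 0.5, 0.5])
unequal_case = simulate_gaps([2.0, 0.5, 0.1, 0.01])  # hypothetical rates
```

Fitting the model to histories like `unequal_case` is one way to probe how sensitive its estimates are to the equal-probability assumption.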

[Rube75]  Abstract:  This  paper  discusses  the  need  for  quantitative  descriptions  of  software  errors  and  methods
for  gathering  such  data.  The  software  development  cycle  is  reviewed,  and  the  frequency  of  the  errors  that  are 
detected  during  software  development  and  independent  validation  are  compared.  Data  obtained  from  validation 
efforts  are  presented,  indicating  the  number  of  errors  in  ten  categories  and  three  severity  levels;  the  inferences 
that  can  be  drawn  from  these  data  are  discussed.  Data  describing  the  effectiveness  of  validation  tools  and  tech¬ 
niques  as  a  function  of  time  are  presented  and  discussed.  The  software  validation  cost  is  contrasted  with  the 
software  development  cost.  The  applications  of  better  quantitative  software  error  data  are  summarized. 

[Rums77]  Abstract:  The  performance  measure  and  analysis  of  software  operating  systems  which  extend  basic 
computing  machinery  is  discussed.  The  description  of  an  external  monitoring  technique  which  facilitates  the
correlation  of  hardware  events  with  software  functions  without  the  need  for  software  monitors  is  presented.  A 
time  related  event  is  defined  to  provide  the  basis  for  the  technique  used  to  implement  the  monitor  system.  In 
addition,  event  analysis  methods  are  introduced  which  allow  a  software  system  execution  profile  to  be  con¬ 
structed. 

[Rust71]  Abbreviated  Introduction:  This  volume  deals  with  efforts  at  control  and  extermination  of  that  notori¬ 
ous  form  of  non-insect  life  which  we  in  the  programming  community  refer  to,  somewhat  contemptuously,  as 
“bugs.”  Although  as  individuals  we  may  in  less  cautious  moments  speak  of  bugs  with  cavalier  disdain,  it  is  always 
with  a  latent  awareness  that  such  bravado  may  be  the  harbinger  of  a  period  of  intense  bug-hunting,  relieved  only 
by  occasional  naps  on  piles  of  discarded  dumps.  To  the  bug-plagued  victim,  the  sympathetic  nods  of  one’s  col¬ 
leagues  more  often  suggest  relief  that  it  is  “him  rather  than  me.” 

The  more  fatalistic  among  us  may  find  such  a  period  good  for  the  soul;  a  penance  for  the  general  malfea¬ 
sance  of  those  involved  in  activity  in  which  a  quantity  of  intellectual  self-indulgence  is  tolerated.  Of  course,  even 
given  the  frustration  of  the  exterminating  effort,  there  is  the  pleasure  in  locating  and  ridding  a  program  of  the 
infecting  source.  The  gratification  of  discovery  could  only  be  enhanced  at  finding  the  bug  was  someone  else’s. 

[SDIO87]  Overview:  The  Strategic  Defense  Initiative  Organization  (SDIO)  Test  and  Evaluation  Master  Plan
(TEMP)  outlines  the  planning  and  management  of  test  and  evaluation  activities  for  the  Strategic  Defense  System. 
It  is  an  evolving  document.  Detailed  planning  for,  and  results  from,  individual  test  and  evaluation  activities  will  be
included  as  the  software  effort  proceeds. 

[SDIO88a]  Overview:  The  Strategic  Defense  Initiative  Organization  (SDIO)  Software  Policy  requires  the  use  of
promising  software  engineering  approaches  for  the  development  and  evolution  of  all  full  scale  development  Stra¬ 
tegic  Defense  System  (SDS)  software.  To  ensure  that  all  SDS  mission-critical  software  exhibits  the  necessary  lev¬ 
els  of  quality,  software  efforts  are  required  to  address  requirements  of  software  reliability,  security,  interoperabil¬ 
ity,  portability,  maintainability,  and  usability  throughout  the  system  life  cycle. 

The  policy  is  restricted  to  identifying  requirements  for  software  engineering  practices.  The  services,  and 
other  implementing  agents,  will  develop  their  own  implementation  documents  that  are  consistent  with  their 
existing  or  planned  software  engineering  management  practices. 

[SDIO88b]  Overview:  This  Strategic  Defense  Initiative  Organization  (SDIO)  Management  Directive  specifies 
the  implementation  of  the  SDIO  Software  Policy  [SDIO88a]  required  on  all  software  efforts  sponsored  directly 
by  the  SDIO.  In  addition  to  specifying  how  the  Software  Policy  must  be  reflected  in  requests  for  proposals,  and 
other  contracting  documents,  the  management  directive  explicitly  enumerates  those  conditions  under  which 
requests  for  waivers  to  the  Software  Policy  will  be  accepted. 

[SERC87]  Abstract:  Mutation  analysis  is  a  software  testing  technique  that  measures  test  data  adequacy,  that  is, 
the  ability  of  test  data  to  ensure  that  certain  errors  are  not  present  in  the  program  under  test.  Mothra  is  a 
software  testing  environment  built  on  the  mutation  analysis  approach  to  determining  test  effectiveness.  It  con¬ 
sists  of  an  integrated  set  of  tools  that  allow  the  user  to  interactively  test  Fortran-77  software  throughout  the 
software  development  cycle.  Mothra  currently  runs  under  4.3  BSD  UNIX,  System  V  UNIX,  and  ULTRIX  32 
1.2. 

This  document  is  a  user’s  manual  for  first  time  users  of  Mothra  as  well  as  a  reference  manual  for  more 
experienced  users.  The  manual  describes  the  function  both  of  the  tools  that  comprise  Mothra  and  of  cdemo,  a 
simple  interface  that  was  designed  to  facilitate  the  use  of  these  tools.  The  first  section  provides  some  background 
information  and  an  explanation  of  the  steps  involved  in  using  Mothra  to  test  software.  Readers  wishing  more 
detailed  information  on  mutation  analysis  should  consult  the  bibliography.  The  second  section  describes  the 
specifics  of  cdemo  itself.  Examples  of  software  testing  with  Mothra  are  presented  throughout  the  document. 
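
As a rough illustration of the mutation analysis idea summarized in this abstract (and not of Mothra's own interface, which targets Fortran-77), the following Python sketch scores two test sets against a handful of mutants. The names `max_of`, `MUTANTS`, and `mutation_score` are hypothetical and were introduced only for this example.

```python
import operator

def max_of(a, b, cmp=operator.gt):
    """Program under test; `cmp` is the operator that a mutant replaces."""
    return a if cmp(a, b) else b

# Each mutant swaps the relational operator (an assumed, typical mutation).
MUTANTS = {
    "gt->lt": operator.lt,
    "gt->ge": operator.ge,  # equivalent mutant: same result when a == b
    "gt->le": operator.le,
}

def mutation_score(tests):
    """Fraction of mutants killed by the test inputs.

    A mutant is killed when at least one test input makes its output
    differ from the output of the original program.
    """
    killed = 0
    for cmp in MUTANTS.values():
        if any(max_of(a, b, cmp) != max_of(a, b) for a, b in tests):
            killed += 1
    return killed / len(MUTANTS)

weak_tests = [(2, 2)]    # equal inputs cannot distinguish any mutant
better_tests = [(1, 2)]  # unequal inputs expose two of the three mutants
print(mutation_score(weak_tests), mutation_score(better_tests))
```

The `gt->ge` mutant can never be killed because it computes the same function as the original, so the best achievable score in this sketch is two thirds.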

[Sack68]  Abstract:  Two  exploratory  experiments  were  conducted  at  System  Development  Corporation  to  com¬ 
pare  debugging  performance  of  programmers  working  under  conditions  of  online  and  offline  access  to  a  com¬ 
puter.  These  are  the  first  known  studies  that  measure  programmers’  performance  under  controlled  conditions  for 
standard  tasks. 

Statistically  significant  results  of  both  experiments  indicated  faster  debugging  under  online  conditions,  but 
perhaps  the  most  important  practical  finding  involves  the  striking  individual  differences  in  programmer  perfor¬ 
mance.  Methodological  problems  encountered  in  designing  and  conducting  these  experiments  are  described; 
limitations  of  the  findings  are  pointed  out;  hypotheses  are  presented  to  account  for  results;  and  suggestions  are 
made  for  further  research. 

[Sahn87]  Abstract:  A  graph-based  modeling  technique  has  been  developed  for  the  stochastic  analysis  of  sys¬ 
tems  containing  concurrency.  The  basis  of  the  technique  is  the  use  of  directed  acyclic  graphs.  These  graphs 
represent  event-precedence  networks  where  activities  may  occur  serially,  probabilistically,  or  concurrently.  When 
a  set  of  activities  occur  concurrently,  the  condition  for  the  set  of  activities  to  complete  is  that  a  specified  number 
of  the  activities  must  complete.  This  includes  the  special  cases  that  one  or  all  of  the  activities  must  complete. 
The  cumulative  distribution  function  associated  with  an  activity  is  assumed  to  have  exponential  polynomial  form. 
Further  generality  is  obtained  by  allowing  these  distributions  to  have  a  mass  at  the  origin  and/or  at  infinity.  The 
distribution  function  for  the  time  taken  to  complete  the  entire  graph  is  computed  symbolically  in  the  time  param¬ 
eter  t.  The  technique  allows  two  or  more  graphs  to  be  combined  hierarchically.  Applications  of  the  technique  to 
the  evaluation  of  concurrent  program  execution  time  and  to  the  reliability  analysis  of  fault-tolerant  systems  are 
discussed. 

[Salt82]  Abstract:  Although  the  Software  Science  metrics  originally  proposed  by  Halstead  are  appealing,  calcu¬ 
lation  of  the  metrics  depends  on  the  existence  of  well-defined  counting  strategies.  The  strategies  require  precise 
definitions  of  operators  and  operands.  It  is  important  that  the  strategies  employed  be  described  in  research 
papers.  Furthermore,  the  presentation  of  helpful  examples  of  the  application  of  the  strategies  is  recommended. 
Good  descriptions  do  not  imply  correct  strategies,  but  they  do  ensure  that  the  strategies  can  be  understood, 
tested,  and  evaluated.  Appendices  to  this  paper  provide  the  description  of  a  Pascal  counting  strategy  and  an 
example  of  applying  the  strategy. 
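
To make the dependence on counting strategy concrete, here is a minimal Python sketch of the standard Halstead measures computed from operator and operand lists that have already been classified. The function name `halstead` and the two strategies shown are assumptions made for illustration; they are not the Pascal counting strategy given in the paper's appendices.

```python
import math
from collections import Counter

def halstead(operators, operands):
    """Compute basic Halstead measures from classified token lists."""
    ops, opnds = Counter(operators), Counter(operands)
    n1, n2 = len(ops), len(opnds)                    # distinct operators, operands
    N1, N2 = sum(ops.values()), sum(opnds.values())  # total occurrences
    vocabulary = n1 + n2
    length = N1 + N2
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": length * math.log2(vocabulary),
        "difficulty": (n1 / 2) * (N2 / n2),
    }

# The statement  x := (a + b) * a  counted under two hypothetical strategies.
operands = ["x", "a", "b", "a"]
strategy_a = halstead([":=", "+", "*"], operands)        # parentheses ignored
strategy_b = halstead([":=", "()", "+", "*"], operands)  # parentheses counted
print(strategy_a)
print(strategy_b)
```

The two strategies give different lengths, volumes, and difficulties for the same statement, which is why the abstract asks that the strategy be described explicitly.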

[Same76]  Abstract:  A  method  for  compiler  testing  using  symbolic  interpretation  is  presented.  This  method  is  a 
cross  between  program  proving  and  program  testing.  It  is  useful  in  demonstrating  that  programs  are  correctly 
translated  from  a  high  level  language  to  a  low  level  language  thereby  improving  the  reliability  of  the  compiler.  The 
term  symbolic  interpretation  is  used  to  describe  the  process  of  obtaining  an  intermediate  form  of  the  low  level 
language  program  that  is  suitable  for  further  processing  by  a  proof  system.  Symbolic  interpretation  is  the  heart  of 
the  system  and  enables  the  recording  of  a  transcript  of  all  computations  in  the  program.  This  process  interprets  a 
set  of  procedures  which  describe  the  effects  of  machine  language  instructions  corresponding  to  the  target 
machine  on  a  suitable  computation  model.  The  highlights  and  limitations  of  the  process  as  well  as  future  work 
are  discussed  in  a  framework  of  a  specific  LISP  implementation  on  a  PDP-10  computer. 

[Sank85]  Abstract:  Anna  is  a  language  extension  of  Ada  to  include  facilities  for  formally  specifying  the  intended 
behavior  of  Ada  programs.  It  augments  Ada  with  precise  machine-processable  annotations  so  that  well  esta¬ 
blished  formal  methods  of  specification  and  documentation  can  be  applied  to  Ada  programs. 

This  paper  describes  an  implementation  of  a  subset  of  Anna.  The  implementation  is  a  transformer  that 
accepts  as  input  an  Anna  parse  tree  and  produces  as  output  an  equivalent  Ada  parse  tree  that  contains  the  neces¬ 
sary  executable  runtime  checks  for  the  Anna  specifications.  An  approach  called  the  Checking  Function 
Approach  is  used.  This  involves  the  generation  of  a  function  for  each  annotation  and  generating  calls  to  these 
functions  at  appropriate  places.  The  transformer  has  to  take  care  of  various  details  like  hiding,  overloading, 
nesting,  etc. 

It  is  hoped  that  the  transformer  will  eventually  cover  most  of  Anna  and  have  various  features  like  a  good 
user  interface,  interaction  with  a  symbolic  debugger,  and  optimization  of  runtime  checks  for  permanent  inclu¬ 
sion. 

[Sari84b]  Abstract:  Protocol  testing  for  the  purpose  of  certifying  the  implementation’s  adherence  to  the  proto¬ 
col  specification  can  be  done  with  a  test  architecture  consisting  of  remote  tester  and  local  responder  processes 
generating  specific  input  stimuli,  called  test  sequences,  and  observing  the  output  produced  by  the  implementa¬ 
tion  under  test.  It  is  possible  to  adapt  test  sequence  generation  techniques  for  finite  state  machines,  such  as  tran¬ 
sition  tour,  characterization,  and  checking  sequence  methods,  to  generate  test  sequences  for  protocols  specified 
as  incomplete  finite  state  machines.  For  certain  test  sequences,  the  tester  or  responder  processes  are  forced  to 
consider  the  timing  of  an  interaction  in  which  they  have  not  taken  part;  these  test  sequences  are  called  nonsyn- 
chronizable.  The  three  test  sequence  generation  algorithms  are  modified  to  obtain  synchronizable  test 
sequences.  The  checking  of  a  given  protocol  for  intrinsic  synchronization  problems  is  also  discussed.  The 
complexities  of  the  synchronizable  test  sequence  generation  algorithms  are  given,  and  complete  testing  of  a  protocol  is  shown 
to  be  infeasible. 

To  extend  the  applicability  of  the  characterization  and  checking  sequences,  different  methods  are  pro¬ 
posed  to  enhance  the  protocol  specifications:  special  test  input  interactions  are  defined  and  a  methodology  is 
developed  to  complete  the  protocol  specifications. 

[Sari87]  Abstract:  Communication  protocol  testing  can  be  done  with  a  test  architecture  consisting  of  remote 
Lower  Tester  and  local  Upper  Tester  processes.  For  real  protocols,  tests  can  be  designed  based  on  the  formal 
specification  of  the  protocol  which  uses  an  extended  finite  state  machine  model.  The  specification  is  transformed 
into  a  simpler  form  consisting  of  normal  form  transitions.  It  can  then  be  modeled  by  a  control  and  a  data  flow 
graph.  The  graphs  are  decomposed  into  subtours  and  data  flow  functions,  respectively.  Tests  are  designed  by  con¬ 
sidering  parameter  variations  of  the  input  primitives  of  each  data  flow  function  and  determining  the  expected 
outputs.  The  methodology  gives  complete  test  coverage  of  all  data  flow  functions  and  control  paths  in  the  specifi¬ 
cation.  Functional  fault  models  are  proposed  for  functions  that  are  not  formally  specified. 

[Sari88a]  Abstract:  With  wide-spread  acceptance  of  the  ISO-OSI  reference  model  and  its  standardized  proto¬ 
cols  in  the  areas  of  computer  communication  and  information  exchange,  various  types  of  protocol  testing  [have] 
become  an  area  of  active  research  and  development.  This  paper  surveys  recent  developments  in  protocol  valida¬ 
tion.  The  discussion  includes  two  important  components  any  protocol  test  system  must  have:  test  sequence 
generator  and  trace  checker  as  well  as  protocol  verification  techniques. 

[Sark89]  Abstract:  At  the  heart  of  any  program  verifier  lies  a  theorem  prover  which  proves  theorems  over  the 
domain  of  the  program.  For  any  meaningful  program,  the  theorems  encountered  are  quite  complex.  The  prob¬ 
lem,  which  is  equivalent  to  the  validity  problem  of  second-order  logic,  reduces  to  that  of  first-order  logic  when 
the  assertions  of  the  program  are  available.  In  both  cases,  the  problem  remains  undecidable  and  human 
intervention  at  some  stage  or  other  becomes  essential.  Resolution-based  theorem  provers  proposed  for  first- 
order  logic  are  very  popular  because  they  allow  easy  human  intervention.  However,  the  theorems  encountered  in 
proving  programs  do  not  follow  the  exact  syntax  of  predicate  calculus;  rather,  they  are  obtained  in  more  popular 
algebraic  notation.  Thus,  the  inference  rules  available  in  first-order  logic  are  not  directly  applicable  to  the  verifi¬ 
cation  conditions  of  the  paths  of  the  program. 

In  the  present  paper  significant  modifications  of  the  first-order  rules  have  been  developed  so  that  they 
apply  directly  to  the  algebraic  expressions.  The  importance  and  implication  of  normalization  of  formulas  in  any 
theorem  prover  have  been  discussed.  It  has  also  been  shown  how  the  properties  of  the  domain  of  discourse  have 
been  taken  care  of  either  by  the  normalizer  or  by  the  inference  rules  proposed.  Through  a  nontrivial  example  the 
following  capabilities  of  the  verifier,  which  would  use  these  inference  rules,  have  been  highlighted:  1)  closeness 
of  the  proof  construction  process  to  human  thought  process  and  2)  efficient  handling  of  user  provided  axioms; 
such  capabilities  make  the  interfacing  with  human  element  easy. 

[Satt72]  Summary:  The  design  of  an  integrated  programming  and  debugging  system  using  the  language  ALGOL 
W  is  described.  The  debugging  tools  are  based  entirely  upon  the  source  language  but  can  be  efficiently  imple¬ 
mented.  The  most  novel  such  tool  is  a  selective  trace,  automatically  controlled  by  execution  frequency  counts. 
System  performance  information  is  included. 

[Scha79]  Abstract:  This  report  presents  the  results  of  a  study  and  investigation  of  software  reliability  models.  In 
particular,  the  purpose  was  to  investigate  the  statistical  properties  of  selected  software  reliability  models,  including 
the  statistical  properties  of  the  parameter  estimates,  and  to  investigate  the  goodness  of  fit  of  the  models  to 
actual  software  error  data.  The  results  indicate  that  the  models  fit  poorly,  due  for  the  most  part  to  the 
vagaries  of  the  data  rather  than  shortcomings  of  the  models. 

[Schl78]  Abstract:  This  paper  examines  the  most  widely  used  reliability  models.  The  models  discussed  fall  into 
two  categories,  the  data  domain  and  the  time  domain.  Besides  tracing  the  historical  development  of  the  various 
models,  their  advantages  and  disadvantages  are  analyzed.  This  includes  models  based  on  discrete  as  well  as  continuous 
probability  distributions.  How  well  a  given  model  performs  its  purpose  in  a  specific  economic  environment 
will  determine  the  usefulness  of  the  model.  Each  of  the  models  is  examined  with  actual  data  as  to  the  applicability 
of  the  error  finding  process. 

[Schn75]  Abstract:  A  non-homogeneous  Poisson  process  is  used  to  model  the  occurrence  of  errors  detected  during 
functional  testing  of  command  and  control  software.  The  parameters  of  the  detection  process  are  estimated 
by  using  a  combination  of  maximum  likelihood  and  weighted  least  squares  methods.  Once  parameter  estimates 
are  obtained,  forecasts  can  be  made  of  cumulative  number  of  detected  errors.  Forecasting  equations  of  cumula¬ 
tive  corrected  errors,  errors  detected  but  not  corrected,  and  the  time  required  to  detect  or  correct  a  specified 
number  of  errors,  are  derived  from  the  detected  error  function.  The  various  forecasts  provide  decision  aids  for 
managing  software  testing  activities.  Naval  tactical  data  system  software  error  data  are  used  to  evaluate  several 
variations  of  the  forecasting  methodology  and  to  test  the  accuracy  of  the  forecasting  equations. 

[Schn77b]  Abbreviated  Introduction:  The  significance  of  program  structural  characteristics  has  been  recog¬ 
nized  for  some  time,  as  witnessed  by  the  emergence  of  structured  programming.  But  there  is  another  tool  avail¬ 
able  that  has  usually  been  overlooked  in  the  software  development  process:  simulation. 

Simulation  is  relatively  new  to  the  evaluation  and  measurement  of  software —  even  though  examples 
abound  of  simulation  and  analytical  models  that  have  been  developed  for  modeling  software  error  detection. 
This  paper  attempts  to  show  how  simulation  can  be  used  both  to  evaluate  alternatives  during  design  and  to  simulate 
the  detection  of  errors  during  testing. 

To  improve  program  quality  we  must  not  only  avoid  errors  during  program  design;  we  must  also  detect 
them  during  testing.  Hence,  one  of  the  characteristics  of  a  good  design  is  a  program  structure  that  allows  easy 
error  detection. 

A  convenient  way  of  describing  program  structure  and  simulating  the  detection  of  errors  is  to  represent 
the  program  in  a  directed  graph.  By  using  a  directed  graph  to  represent  the  structure  of  a  program  and  simulation 
to  study  program  error  detection,  the  following  information  can  be  obtained: 

1.  Error  detection  (number  or  fraction  of  errors  detected)  as  a  function  of  a  program’s  structural  characteristics, 
for  a  given  number  of  tests.  The  test  consists  of  beginning  simulated  program  execution  at  the  start  node, 
detecting  and  correcting  any  errors,  restarting  at  the  start  node,  and  repeating  this  process  until  a  terminal  node 
is  reached. 

2.  Error  detection  as  a  function  of  number  of  tests  for  given  structural  characteristics. 

Structural  characteristics  correspond  to  program  characteristics.  For  example,  numbers  of  nodes,  arcs, 
paths  and  source  statements  correspond  to  branching  and  merging,  arithmetic  and  data  transfer  operations,  execution 
sequences,  and  size. 

[Schn77c]  Abstract:  Program  structure  and  modularity  are  important  considerations  for  the  development  of  reli¬ 
able  software.  Most  software  specialists  agree  that  higher  reliability  is  achieved  when  software  systems  are  highly 
modularized  and  module  structure  is  kept  simple.  However,  if  this  principle  is  carried  too  far  in  the  design  of 
large  systems,  lower  rather  than  higher  reliability  may  result.  This  may  occur  because  the  added  complexity  of  a 
large  number  of  communication  paths  among  a  large  number  of  small  modules  may  exceed  the  reduction  in  com¬ 
plexity  of  individual  modules.  Real  time  operating  system  structures  are  examined  in  terms  of  their  modularity 
characteristics.  Proposals  are  advanced  for  improving  the  structure  of  real  time  operating  systems. 

[Schn79a]  Abstract:  The  propensity  to  make  programming  errors  and  the  rates  of  error  detection  and  correction 
are  dependent  on  program  complexity.  Knowledge  of  these  relationships  can  be  used  to  avoid  error  prone  struc¬ 
tures  in  software  design  and  to  devise  a  testing  strategy  which  is  based  on  anticipated  difficulty  of  error  detection 
and  correction.  An  experiment  in  software  error  data  collection  and  analysis  was  conducted  in  order  to  study 
these  relationships  under  conditions  where  the  error  data  could  be  carefully  defined  and  collected.  Several  com¬ 
plexity  measures  which  can  be  defined  in  terms  of  the  directed  graph  representation  of  a  program,  such  as 
cyclomatic  number,  were  analyzed  with  respect  to  the  following  error  characteristics:  errors  found,  time  between 
error  detections,  and  error  correction  time.  Significant  relationships  were  found  between  complexity  measures 
and  error  characteristics.  The  meaning  of  directed  graph  structural  properties  in  terms  of  the  complexity  of  the 
programming  and  testing  tasks  was  examined. 
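
For readers unfamiliar with the cyclomatic number mentioned in this abstract, the sketch below computes McCabe's measure v(G) = e - n + 2p for a program's directed graph, where e is the number of arcs, n the number of nodes, and p the number of connected components. The function name and the example graph are hypothetical and are not taken from the paper's experiment.

```python
def cyclomatic_number(edges, num_components=1):
    """Return e - n + 2p for a control-flow graph given as (src, dst) arcs.

    Nodes are inferred from the arcs, so isolated nodes are not counted.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components

# Hypothetical graph of a single if-then-else between entry and exit.
if_then_else = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
print(cyclomatic_number(if_then_else))  # 5 arcs - 5 nodes + 2 = 2
```

A straight-line sequence yields 1 under the same formula, and each additional binary decision raises the value by one.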

[Schn79b]  Introduction:  Computer  program  graphs  have  proven  very  useful  because  they  illuminate  the  structural 
characteristics  of  a  program.  Structural  characteristics,  as  a  representation  of  program  complexity,  have 
been  shown  to  be  strongly  related  to  program  development  time,  program  quality  and  difficulty  of  debugging. 
The  use  of  graphs  for  these  purposes  is  not  widely  known  or  understood  in  the  data  processing  community.  It  is 
the  aim  of  this  paper  to  provide  an  introduction  to  graphs  as  they  apply  to  program  representation  and  to  show 
examples  of  their  use  in  program  design  and  debugging. 

[Schr84]  Abstract:  This  paper  describes  an  attempt  to  integrate  the  collection  and  the  efficient  utilization  of 
measurements  in  the  development  and  the  use  of  programs.  The  work  presented  consists  of  three  parts: 

•  the  design  of  both  static  and  dynamic  measurement  tools, 

•  examples  of  data  processing  on  measurements  collected  on  a  sample  of  Pascal  programs, 

•  the  design  of  a  quantitative  documentation  of  a  program,  which  is  automatically  built  as  measurements  are 
collected. 

The  first  and  third  steps  have  been  developed  inside  an  existing  programming  environment,  Mentor,  and 
we  shall  discuss  the  advantages  we  found  in  integrating  the  tools  in  such  an  environment. 

[Schu81]  Abstract:  This  paper  addresses  the  problem  of  programming  distributed  systems  within  the  framework 
of  the  Ada  language,  which  provides  primitives  for  interprocess  communication  based  upon  the  model  of  Communicating 
Sequential  Processes.  We  first  discuss  our  basic  assumptions  concerning  the  underlying  target  configuration, 
the  physical  communication  medium  which  is  to  support  that  application  and  pattern  of  the  logical 
communication  within  the  application  proper.  We  then  develop  a  first  approach  for  constructing  such  applica¬ 
tions  using  the  separate  compilation  facilities  of  Ada.  Finally,  we  consider  two  possible  protocols  for  implement¬ 
ing  the  requisite  distributed  interprocess  communication,  referred  to  as  the  Remote  Entry  Call  and  the  Remote 
Procedure  Call,  respectively. 

[Schw70a]  Overview:  A  survey  of  the  type,  frequency,  and  habitat  of  bugs  is  outlined.  Debugging  tools  presently 
available  are  discussed  and  suggestions  for  their  development  advanced.  The  role  of  “proofs  of  program  correct¬ 
ness”  and  the  debugging  process  itself  are  discussed. 

[Scot84a]  Abstract:  New  data  domain  reliability  models  have  been  developed  for  the  N-version,  Recovery  Block 
and  Consensus  Recovery  Block  approaches  to  fault-tolerant  software  and  investigation  of  the  validity  of  each  of 
these  models  is  underway.  Central  to  validation  is  the  underlying  dependence  of  the  multiple  versions  of  software 
modules  required  by  these  approaches  and  the  impact  of  this  dependence  on  reliability  predictions.  This  paper 
presents  reliability  models  for  all  three  fault-tolerance  approaches  using  assumptions  of  both  independence  and 
dependence.  The  presentation  of  the  experimental  investigation  focuses  on  the  Recovery  Block  strategy.  The 
results  can  be  summarized  by  saying  the  models  relying  on  the  assumption  of  module  independence  did  not  adequately 
predict  reliability  in  the  experiments.  The  dependent  models  were  successful.  Furthermore,  the  underlying 
dependence  could  not  be  attributed  to  common  cause  errors  resulting  from  similarities  in  the  solution  algorithms. 
Rather,  the  dependence  was  attributable  to  the  difficulty  of  the  input  test  cases. 

[Scot84b]  Abstract:  Results  are  presented  for  an  experiment  conducted  at  North  Carolina  State  University  to 
validate  the  author’s  fault-tolerant  software  reliability  models.  Both  independent  and  dependent  versions  of  the 
Recovery  Block,  N-Version  Programming,  and  Consensus  Recovery  Block  reliability  models  were  studied.  It 
was  shown  that  the  assumption  of  version  independence  leads  to  poor  predictions  of  reliability.  The  reliability 
gains  offered  by  each  of  the  three  methods  of  software  fault-tolerance  were  also  compared. 
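
The independence assumption that these abstracts report as a poor predictor can be sketched with a textbook calculation. The Python function below is only an illustration under stated assumptions (identical per-version success probability, a perfect voter, statistically independent versions) and is not the author's reliability model; the name `majority_vote_reliability` is hypothetical.

```python
from math import comb

def majority_vote_reliability(p, n=3):
    """Probability that a strict majority of n versions is correct.

    Assumes each version is correct with probability p, independently of
    the others, and that the voter itself never fails.
    """
    majority = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(majority, n + 1)
    )

# With three versions this reduces to 3p^2 - 2p^3.
print(majority_vote_reliability(0.9))
```

When failures are correlated, for example because some inputs are hard for every version, the true reliability can fall well below this figure, which is consistent with the experimental finding reported above.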

[Scot87]  Abstract:  In  situations  in  which  computers  are  used  to  manage  life-critical  situations,  software  errors 
that  could  arise  due  to  inadequate  or  incomplete  testing  cannot  be  tolerated.  This  paper  examines  three  methods 
of  creating  fault-tolerant  software  systems,  Recovery  Block,  N-Version  Programming,  and  Consensus  Recovery 
Block,  and  it  presents  reliability  models  for  each.  The  models  are  used  to  show  that  one  method,  the  Consensus 
Recovery  Block,  is  more  reliable  than  the  other  two. 

The  results  of  an  experiment  used  to  validate  the  models  are  presented.  It  is  demonstrated  that,  for  highly 
reliable  acceptance  tests,  the  Consensus  Recovery  Block  system  gave  the  highest  reliability.  In  all  cases,  the  Con¬ 
sensus  Recovery  Block  and  Recovery  Block  systems  were  better  than  the  N-Version  Programming  systems. 

A  simple  cost  model  that  shows  the  relative  costs  of  increasing  software  reliability  using  the  three  fault- 
tolerant  methods  is  presented. 

[Sedl83]  Abstract:  Fault  localization  in  program  debugging  is  the  process  of  identifying  program  statements 
which  cause  anomalous  behavior.  We  have  developed  a  prototype,  knowledge-based  model  of  the  fault  localization 
process.  Novel  features  of  the  model  include  multiple  localization  tactics  and  a  recognition-based  mechanism 
for  program  abstraction.  An  explicit  division  of  knowledge  from  the  applications,  programming  and 
language  domains  facilitates  model  tuning  within  as  well  as  across  applications  domains.  We  describe  model  structure 
and  performance  for  a  class  of  faults  associated  with  master  file  update  programs.  We  foresee  applications 
of  the  model  as  an  initial  cognitive  theory  of  expertise  in  fault  localization  and  as  a  partially  automated  debugging 
tool. 

[Selb85]  Abbreviated  Abstract:  The  evaluation  of  software  technologies  suffers  because  of  the  lack  of 
quantitative  assessment  of  their  effect  on  software  development  and  modification.  A  seven-step  approach  for 
quantitatively  evaluating  software  technologies  couples  software  methodology  evaluation  with  software  measure¬ 
ment.  The  approach  is  applied  in-depth  in  the  following  three  areas.  1)  Software  Testing  Strategies:  A  74-subject 
study,  including  32  professional  programmers  and  42  advanced  university  students,  compared  code  reading, 
functional  testing,  and  structural  testing  in  a  fractional  factorial  design.  2)  CLEANROOM  Software  Develop¬ 
ment:  Fifteen  three-person  teams  separately  built  a  1200-line  message  system  to  compare  CLEANROOM 
software  development  (in  which  software  is  developed  completely  off-line)  with  a  more  traditional  approach.  3) 
Characteristic  Software  Metric  Sets:  In  the  NASA  SEL  production  environment,  a  study  of  65  candidate  product 
and  process  measures  of  652  modules  from  six  (51,000  -  112,000  line)  projects  yielded  a  characteristic  set  of 
software  cost/quality  metrics. 

[Selb86]  Abstract:  This  study  compares  the  three  testing  strategies  of  (1)  code  reading  by  stepwise  abstraction, 
(2)  functional  testing  using  equivalence  partitioning  and  boundary  value  analysis,  and  (3)  structural  testing  with 
100%  statement  coverage  criteria  -  and  the  six  pairwise  combinations  of  these  techniques.  Thirty  two  profes¬ 
sional  programmers  applied  the  techniques  to  three  unit-sized  programs  in  a  fractional  factorial  experimental 
design. 

The  major  results  of  this  study  are  the  following. 

1.  The  six  combined  testing  approaches  detected  17.7%  more  of  the  programs’  faults  on  the  average  than  did  the 
three  single  techniques,  which  was  a  35.5%  improvement  in  fault  detection. 

2.  The  highest  percentage  of  the  programs’  faults  were  detected  when  there  was  a  combination  of  either  two  code 
readers  or  a  code  reader  and  a  functional  tester.  However,  a  pairing  of  two  code  readers  detected  more  faults 
per  hour  than  did  a  pairing  of  a  code  reader  and  a  functional  tester. 

3.  The  pairing  of  two  individuals  of  advanced  expertise  resulted  in  the  highest  percentage  of  faults  being 
detected. 

4.  The  most  cost-effective  (number  of  faults  detected  per  hour)  testing  approach  overall  was  when  code  reading 
was  applied  by  an  individual.  The  most  cost-effective  combined  testing  approach  was  when  a  code  reader  was 
paired  with  either  another  code  reader  or  a  structural  tester. 

5.  Both  the  percentage  of  faults  detected  and  the  fault  detection  cost-effectiveness  depended  on  the  type  of 
software  being  tested. 

[Selb87a]  Abstract:  Software  metrics  have  been  useful  to  measure,  evaluate,  and  control  the  software  develop¬ 
ment  process  and  evolving  software  product.  Software  environments  provide  software  tools  and  infrastructure 
to  support  a  variety  of  activities  related  to  software  development.  This  paper  proposes  23  guidelines  for  incor¬ 
porating  metrics  into  software  environments.  The  guidelines  are  organized  into  five  areas:  the  purpose,  type, 
scope,  collection,  and  analysis  of  metrics.  An  example  application  of  the  guidelines  in  a  software  environment 
project  is  described  briefly. 

[Selb87b]  Abstract:  The  CLEANROOM  software  development  approach  is  intended  to  produce  highly  reliable 
software  by  integrating  formal  methods  for  specification  and  design,  nonexecution-based  program  development, 
and  statistically  based  independent  testing.  In  an  empirical  study,  15  three-person  teams  developed  versions  of 
the  same  software  system  (800-2300  source  lines);  ten  teams  applied  CLEANROOM,  while  five  applied  a  more 
traditional  approach.  This  analysis  characterizes  the  effect  of  CLEANROOM  on  the  delivered  product,  the 
software  development  process,  and  the  developers. 

The  major  results  of  this  study  are  the  following. 

1.  Most  of  the  developers  were  able  to  apply  the  techniques  of  CLEANROOM  effectively  (six  of  the  ten 
CLEANROOM  teams  delivered  at  least  91%  of  the  required  system  functions). 

2.  The  CLEANROOM  teams’  products  met  system  requirements  more  completely  and  had  a  higher  percentage 
of  successful  operationally  generated  test  cases. 

3.  The  source  code  developed  using  CLEANROOM  had  more  comments  and  less  dense  control-flow  complex¬ 
ity. 

4.  The  more  successful  CLEANROOM  developers  modified  their  use  of  the  implementation  language;  they  used 
more  procedure  calls  and  IF  statements,  used  fewer  CASE  statements  and  WHILE  statements,  and  had  a 
lower  frequency  of  variable  reuse  (average  number  of  occurrences  per  variable). 

5.  All  ten  CLEANROOM  teams  made  all  of  their  scheduled  intermediate  product  deliveries,  while  only  two  of 
the  five  non-CLEANROOM  teams  did. 

6.  Although  86%  of  the  CLEANROOM  developers  indicated  that  they  missed  the  satisfaction  of  program  execu¬ 
tion  to  some  extent,  this  had  no  relation  to  the  product  quality  measures  of  implementation  completeness  and 
successful  operational  tests. 

7.  81%  of  the  CLEANROOM  developers  said  that  they  would  use  the  approach  again. 

[Selb88a]  Abstract:  One  central  feature  of  the  structure  of  a  software  system  is  the  coupling  among  its  com¬ 
ponents  (e.g.,  subsystems,  modules)  and  the  cohesion  within  them.  The  purpose  of  this  study  is  to  quantify  ratios 
of  coupling  and  cohesion  and  use  them  in  the  generation  of  hierarchical  system  descriptions.  The  ability  of  the 
hierarchical  descriptions  to  localize  errors  by  identifying  error-prone  system  structure  is  evaluated  using  actual 
error  data.  Measures  of  data  interaction,  called  data  bindings,  are  used  as  the  basis  for  calculating  software  cou¬ 
pling  and  cohesion.  A  135,000  source  line  system  from  a  production  environment  has  been  selected  for  empirical 
analysis.  Software  error  data  was  collected  from  high-level  system  design  through  system  test  and  from  some 
field  operation  of  the  system.  A  set  of  five  tools  is  applied  to  calculate  the  data  bindings  automatically,  and  clus¬ 
ter  analysis  is  used  to  determine  a  hierarchical  description  of  each  of  the  system’s  77  subsystems.  An  analysis  of 
variance  model  is  used  to  characterize  subsystems  and  individual  routines  that  had  either  many/few  errors  or 
high/low  error  correction  effort. 

[Shan80]  Abstract:  The  main  intent  of  this  paper  is  to  derive  expressions  for  software  performance  prediction 
using  a  state-dependent  error  occurrence-rate  model.  Using  a  Markov  process  representation  for  the  remaining 
number  of  errors  in  the  software  system  we  derive  a  set  of  linear  difference-differential  equations  for  the  proba¬ 
bility  distribution  of  the  number  of  remaining  errors  at  an  arbitrary  time  t.  Solving  this  set  of  equations  we  obtain 
a  binomial  distribution  for  the  number  of  remaining  errors.  We  also  obtain  the  relevant  system  performance 
measures for the software system. This analysis is first carried out assuming that the initial error content at
time t = 0 is a fixed unknown constant, and is subsequently extended to the case in which the initial error content is a
random  variable.  Using  these  results  we  exhibit  an  interesting  insensitivity  characteristic  of  this  model. 
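
As an illustration of the binomial result, the following sketch assumes the simplest state-dependent case: each of N initial errors is removed independently after an exponentially distributed time with a constant per-error rate phi, and debugging is perfect, so the occurrence rate with n errors remaining is n * phi. Under that assumption (ours, and not necessarily the general form treated in the paper), the number of errors remaining at time t is binomial with parameters N and exp(-phi * t). The names N, phi, remaining_error_pmf, and expected_remaining are hypothetical.

```python
import math

def remaining_error_pmf(N, phi, t):
    """Return [P(n errors remain at time t) for n = 0..N].

    Illustrative assumption: each of the N initial errors is removed
    independently after an exponential time with rate phi.
    """
    p = math.exp(-phi * t)  # probability that a given error is still present
    return [math.comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)]

def expected_remaining(N, phi, t):
    """Mean of the binomial distribution above: N * exp(-phi * t)."""
    return N * math.exp(-phi * t)

# Example with assumed values: 10 initial errors, rate 0.1 per unit time.
pmf = remaining_error_pmf(N=10, phi=0.1, t=5.0)
mean = expected_remaining(N=10, phi=0.1, t=5.0)
```

At t = 0 all of the probability mass sits on n = N, and the mean decays exponentially as t grows.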

[Shan81]  Abstract:  In  this  paper,  assuming  a  state-  and  time-dependent  software  failure  rate  and  imperfect 
debuggings,  we  develop  a  simple  binomial  model  for  software  error  occurrences.  Maximum  likelihood  estimates 
for  the  required  parameters  of  this  model  are  also  derived.  It  is  established  that  the  Jelinski-Moranda,  imperfect 
debugging  and  non-homogeneous  Poisson  process  models  are  all  special  cases  of  ours. 

[Shan82]  Abstract:  The  purpose  of  this  paper  is  to  develop  a  method  for  designing  and  verifying  data  abstrac¬ 
tions  using  the  functional  approach.  Before  doing  so,  the  existing  techniques  for  designing  and  verifying  pro¬ 
cedure  and  data  abstractions  will  be  surveyed  briefly.  These  techniques  will  then  be  modified  and  extended  to  ver¬ 
ify  data  abstractions.  By  using  the  concept  of  a  mathematical  function,  one  can  model  the  behavior  of  a  pro¬ 
cedure  abstraction  and  give  a  more  uniform  and  clearer  meaning  to  the  stepwise  refinement  and  verification  of 
procedure  abstractions.  The  concept  of  a  state  machine  is  then  used  as  a  basis  to  specify  data  abstractions.  Using 
state  machine  specification,  a  technique  for  expressing  the  design  of  a  data  abstraction  is  then  given.  A  method  is 
then  developed  to  verify  the  design  of  a  data  abstraction  with  respect  to  its  specifications. 

[Shat88]  Abstract:  In  order  to  understand  and  analyze  real-time  distributed  programs,  one  must  account  for 
interactions  between  processes.  Unfortunately,  these  interactions  can  be  quite  complex  due  to  concurrency  and 
nondeterminism.  This  paper  describes  a  framework  for  automated  static  analysis  of  distributed  programs  written 
in  Ada.  The  analysis  is  aimed  at  discovery  of  a  program’s  potential  tasking  behavior,  that  is,  behavior  in  terms  of 
tasking-related  issues.  Central  to  the  framework  is  the  translation  of  a  program  into  an  abstract  grammar  system 
that  represents  a  Petri  net  graph  model. 

[Shaw78] Abstract: Flow expressions describe sequential and concurrent flows of entities, such as control, messages,
commands, jobs, and resources, through system software components, such as programs, procedures,
modules,  and  processes.  They  consist  of  regular  expressions  extended  with  cyclic  and  interleaving  operators  and 
a  synchronization  facility.  The  language  of  flow  expressions  is  defined  and  some  of  its  formal  properties  are 
presented.  Applications  are  exhibited  in  the  modeling  of  concurrent  programs,  the  description  of  operating  sys¬ 
tem  architectures,  the  specification  and  solution  of  synchronization  problems,  the  flow  and  description  of  com¬ 
mand  languages,  and  in  systems  analysis  and  verification. 

[Shaw89]  Abstract:  Halstead’s  theory  of  software  science  is  used  to  describe  the  compilation  process  and  gen¬ 
erate  a  compiler  performance  index.  A  nonlinear  model  of  compile  time  is  estimated  for  four  Ada  compilers.  A 
fundamental  relation  between  compile  time  and  program  modularity  is  proposed.  Issues  considered  include  data 
collection  procedures,  the  development  of  a  counting  strategy,  the  analysis  of  the  complexity  measures  used,  and 
the  investigation  of  significant  relationships  between  program  characteristics  and  compile  time.  The  results  sug¬ 
gest  that  the  model  has  a  high  predictive  power  and  provides  interesting  insights  into  compiler  performance 
phenomena.  The  research  suggests  that  the  discrimination  rate  of  a  compiler  is  a  valuable  performance  index  and 
is  preferred  to  average  compile  time  statistics. 

[Shei81]  Abstract:  Most  innovations  in  programming  languages  and  methodology  are  motivated  by  a  belief  that 
they  will  improve  the  performance  of  the  programmers  who  use  them.  Although  such  claims  are  usually 
advanced  informally,  there  is  a  growing  body  of  research  which  attempts  to  verify  them  by  controlled  observation 
of  programmers’  behavior.  Surprisingly,  these  studies  have  found  few  clear  effects  of  changes  in  either  program¬ 
ming  notation  or  practice.  Less  surprisingly,  the  computing  community  has  paid  relatively  little  attention  to  these 
results.  This  paper  reviews  the  psychological  research  on  programming  and  argues  that  its  ineffectiveness  is  the 
result  of  both  unsophisticated  experimental  technique  and  a  shallow  view  of  the  nature  of  programming  skill. 

[Shen83]  Abstract:  The  theory  of  software  science  was  developed  by  the  late  M.H.  Halstead  of  Purdue  Univer¬ 
sity  during  the  early  1970’s.  It  was  first  presented  in  unified  form  in  the  monograph  “Elements  of  Software  Sci¬ 
ence”  published  by  Elsevier  North-Holland  in  1977.  Since  it  claimed  to  apply  scientific  method  to  the  very  com¬ 
plex  and  important  problem  of  software  production,  and  since  experimental  evidence  supplied  by  Halstead  and 
others  seemed  to  support  the  theory,  it  drew  widespread  attention  from  the  computer  science  community. 

Some  researchers  have  raised  serious  questions  about  the  underlying  theory  of  software  science.  At  the 
same  time,  experimental  evidence  supporting  some  of  the  metrics  continues  to  be  presented.  This  paper  is  a  cri¬ 
tique  of  the  theory  as  presented  by  Halstead  and  a  review  of  experimental  results  concerning  software  science 
metrics  published  since  1977. 

[Shen85]  Abstract:  A  major  portion  of  the  effort  expended  in  developing  commercial  software  today  is  associ¬ 
ated  with  program  testing.  Schedule  and/or  resource  constraints  frequently  require  that  testing  be  conducted  so 
as  to  uncover  the  greatest  number  of  errors  possible  in  the  time  allowed.  In  this  paper  we  describe  a  study  under¬ 
taken  to  assess  the  potential  usefulness  of  various  product-  and  process-related  measures  in  identifying  error- 
prone  software.  Our  goal  was  to  establish  an  empirical  basis  for  the  efficient  utilization  of  limited  testing 
resources  using  objective,  measurable  criteria.  Through  a  detailed  analysis  of  three  software  products  and  their 
error  discovery  histories,  we  have  found  simple  metrics  related  to  the  amount  of  data  and  the  structural  complex¬ 
ity  of  programs  to  be  of  value  for  this  purpose. 

[Shep78]  Overview:  The  late  70’s  find  structured  programming  increasingly  popular — this  and  other  techniques 
are  programming’s  future.  But  what  does  experimental  evaluation  say  about  their  actual  effects  on  programmer 
performance? 

[Shep79]  Abbreviated  Introduction:  In  a  series  of  experiments  we  investigated  the  effects  of  modern  coding 
practices  on  three  different  programming  tasks.  The  first  experiment  examined  the  effects  of  structured  coding 
and  mnemonic  variable  names  on  program  comprehension.  The  second  studied  the  influence  of  structured 
coding and commenting style on modification tasks. The third studied the influence of structured coding and of
several  code-structuring  methods  on  debugging  performance.  Participants  in  these  experiments  were  all  profes¬ 
sional  programmers  whose  experience  ranged  from  several  months  to  25  years  and  averaged  six  or  more  years. 
Participants  in  each  experiment  were  selected  from  several  locations  in  order  to  increase  the  diversity  of  pro¬ 
gramming  backgrounds. 

[Shim88] Abbreviated Introduction: Reliability is a pressing concern in the development of software for modern
systems.  Many  techniques  have  been  proposed  to  improve  software  reliability.  One  technique,  N-Version  Pro¬ 
gramming,  has  been  used  in  software  to  control  aircraft  and  railroads  and  has  been  proposed  for  nuclear  power 
plants.  One  drawback  to  the  n-version  technique  is  that  the  total  development  costs  are  increased  due  to  the 
costs  of  developing  multiple  versions. 

In  order  to  make  the  technique  affordable,  it  has  been  suggested  that  n-version  programming  will  be  so 
effective  that  it  can  be  used  as  a  partial  substitute  for  current  software  verification  and  validation  procedures.  It 
seems  important  to  investigate  the  hypothesis  that  testing  can  be  reduced  in  n-version  systems,  and  in  general,  to 
study  the  relationship  between  fault  elimination  techniques  and  fault  tolerance  techniques. 

There  have  also  been  proposals  to  use  n-version  voting  in  the  testing  process.  In  this  method,  the  vote 
itself  is  used  as  the  test  oracle,  and,  therefore,  a  larger  number  of  tests  can  be  executed.  The  underlying  assump¬ 
tions  here  are  that  (1)  given  that  a  fault  leads  to  an  erroneous  output,  it  will  be  detected  by  the  voting  process, 
and  (2)  the  faults  that  would  have  been  detected  by  other  testing  techniques,  such  as  structural  testing  or  static 
analysis techniques, will be elicited and detected by voting on random or functional test cases alone.

The  authors  of  this  paper  are  engaged  in  a  large-scale  experiment  comparing  software  fault-tolerance  and 
software  fault  elimination  as  approaches  to  improving  software  quality.  This  paper  describes  the  experiment  and 
the  results  that  apply  to  the  appropriateness  and  underlying  assumptions  of  these  two  proposals. 
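
The idea of using the vote itself as a test oracle can be sketched as follows. This is a minimal illustration under our own assumptions, not the experimental harness used by the authors: the versions are plain Python callables returning hashable outputs, and the helper names majority_vote and vote_as_oracle are hypothetical.

```python
from collections import Counter

def majority_vote(outputs):
    """Return (winner, dissenting_indices).

    winner is None when no output value has a strict majority;
    in that case every version is listed as dissenting.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, list(range(len(outputs)))
    return value, [i for i, out in enumerate(outputs) if out != value]

def vote_as_oracle(versions, test_inputs):
    """Run every version on every input and report the disagreements."""
    report = []
    for x in test_inputs:
        outputs = [version(x) for version in versions]
        winner, dissenters = majority_vote(outputs)
        if winner is None or dissenters:
            report.append((x, outputs, dissenters))
    return report

# Three hypothetical versions of absolute value; the third is deliberately faulty.
versions = [abs, lambda x: x if x >= 0 else -x, lambda x: x]
report = vote_as_oracle(versions, [-2, 0, 3])
```

Only the input -2 exposes the faulty version here. The sketch also makes the limit of assumption (1) concrete: if a majority of versions produce the same wrong output, the vote accepts it and nothing is reported.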

[Shne75]  Abbreviated  Background:  In  the  early  stages  of  the  development  of  high-level  languages,  radically 
differing  alternatives  were  often  promulgated.  Now  as  the  field  matures,  there  is  a  widespread  recognition  of  the 
usefulness of a variety of languages.

Although  Dijkstra  explicitly  stated  that  computer  programming  was  primarily  a  human  activity  as  early  as 
1965,  it  was  not  until  the  publication,  in  1971,  of  Gerald  Weinberg’s  text  The  Psychology  of  Computer  Program¬ 
ming  that  this  notion  was  widely  recognized.  [This]  text  concentrates  on  defining  the  programming  task  in  the 
context  of  the  professional  environment  and  promotes  the  notion  of  “egoless  programming  teams.”  This  team 
organization  concept  may  be  contrasted  with  the  “chief  programmer  team”  strategy  advocated  by  IBM.  Experi¬ 
mental  comparison  of  interactions  in  these  personal  organization  strategies  would  be  an  intriguing  task  for  social 
psychologists.  Other  sections  of  Weinberg’s  book  concentrate  on  individual  personality  factors,  training,  and 
motivational  factors.  Much  more  research  needs  to  be  done  on  the  psychological  make-up  of  programmers.  For¬ 
tunately,  psychologists  have  begun  to  study  programming  behavior  as  an  aspect  of  problem  solving.  Training  and 
teaching  of  programming  has  long  been  of  interest  to  academically  oriented  researchers.  Programming  has  only 
recently  become  a  subject  for  related  disciplines  such  as  educational  psychology. 

Although  experimentation  in  the  above  mentioned  areas  would  undoubtedly  be  welcome,  the  focus  of  this 
paper  is  on  experiments  in  programming  language  features,  stylistic  considerations  and  design  techniques. 

[Shne77a]  Abstract:  This  paper  describes  previous  research  on  flowcharts  and  a  series  of  controlled  experiments 
to  test  the  utility  of  detailed  flowcharts  as  an  aid  to  program  composition,  comprehension,  debugging,  and  modif¬ 
ication.  No  statistically  significant  difference  between  flowchart  and  nonflowchart  groups  has  been  shown, 
thereby  calling  into  question  the  utility  of  detailed  flowcharting.  A  program  of  further  research  is  suggested. 

[Shol75]  Abstract:  An  engineering-oriented  performance  model  of  a  computation  is  developed  by  extending  the 
concept  of  a  computation  structure  to  cover  the  performance  costs  appropriate  to  software  modeling.  The  model 
allows  both  serial  and  parallel  (multiprocessor)  configurations,  and  the  evaluation  of  both  time  and  space  param¬ 
eters  for  alternate  realizations. 

A brief discussion on the use of the model as a mechanism to guide the performance optimization of
programs is included.

[Shoo72]  Abstract:  This  paper  discusses  a  probabilistic  model  for  predicting  software  reliability.  The  model  con¬ 
stants  are  calculated  from  debugging  data  collected  from  similar  previous  programs.  The  calculations  result  in  a 
decreasing probability of no software errors vs. operating time (reliability function). The decay rate of
the  reliability  function  (reciprocal  of  the  mean  time  to  failure)  decreases  as  a  function  of  the  man-months  of 
debugging  time.  The  model  provides  initial  estimates  of  software  reliability  before  any  code  is  written  and  allows 
later updating to improve the accuracy of the parameters when integration or operational tests begin.

[Shoo75]  Abstract:  In  order  to  develop  some  basic  information  on  software  errors,  an  experiment  in  collecting 
data  on  types  and  frequencies  of  such  errors  was  conducted  at  Bell  Laboratories. 

The paper reports the results of this experiment, whose objectives were to: (1) Develop and utilize a set of
terms  for  describing  possible  types  of  errors,  their  nature,  and  their  frequency;  (2)  Perform  a  pilot  study  to  deter¬ 
mine  if  data  of  the  type  reported  in  this  paper  could  be  collected;  (3)  Investigate  the  error  density  and  its 
correspondence  to  predictions  from  previous  data  reported;  (4)  Develop  data  on  how  resources  are  expended  in 
debugging. 

A  program  of  approximately  4K  machine  instructions  (final  size)  was  chosen.  Programmers  were  asked  to 
fill  out  for  each  error,  in  addition  to  the  regular  Trouble  Report/Correction  Report  (TR/CR)  form,  a  special 
Supplementary  TR/CR  form  for  the  purpose  of  this  experiment.  Sixty-three  TR/CR  and  Supplementary  forms 
were  completed  during  the  Test  and  Integration  phase  of  the  program. 

In  general,  the  data  collected  were  felt  to  be  accurate  enough  for  the  purpose  of  the  analyses  presented. 
The  63  forms  represented  a  little  over  1-1/2%  of  the  total  number  of  machine  instructions  of  the  program.  (In 
good  agreement  with  the  1%  to  2%  range  noted  on  previous  studies.) 

It  was  discovered  that  a  large  percentage  of  the  errors  was  found  by  hand  processing  (without  the  aid  of  a 
computer). This method was found to be much cheaper than techniques involving machine testing.

[Shoo76]  Abstract:  Many  previous  software  reliability  prediction  models  by  this  author  and  others  have  concen¬ 
trated  on  the  bulk  (macro)  aspects  of  the  program.  This  paper  describes  a  newly  developed  micro  model  which 
is  based  on  program  structure. 

It  is  assumed  that  the  program  has  been  written  in  structured  or  modular  form  so  that  decomposition  into 
its  constituent  parts  is  simple.  Further,  we  assume  that  via  analysis  of  the  program  the  decomposition  can  be 
related  to  several  paths  or  other  functional  structures  within  the  program. 

The model is constructed based upon the frequencies with which each of the j paths is run (f_j), the running
time of each path (t_j), and the probability of error along each path (q_j).

Several methods of calculating or measuring the f_j, t_j, and q_j parameters are suggested. In fact it is possible
to use one technique (historical data) to produce crude estimates at the start of the design, and refine the estimates
with more accurate values as the design progresses.

The  paper  concludes  with  the  application  of  the  model  to  a  particular  example:  calculation  of  the  roots  of 
a  quadratic  equation,  and  a  discussion  of  proposed  experiments  for  validating  the  model. 
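
A rough sketch of how path-level parameters of this kind can be combined is given below. It is a simplified illustration under our own assumptions and is not claimed to reproduce the paper's equations: each run follows exactly one path, the path frequencies sum to one, and a crude failure rate is taken as expected failures divided by expected running time. The function name path_model and all numeric values are hypothetical.

```python
def path_model(freqs, times, error_probs, runs):
    """Combine per-path frequency, running time, and error probability.

    Returns (expected_failures, expected_total_time, crude_failure_rate)
    for the given number of runs. Illustrative simplification only.
    """
    if not (len(freqs) == len(times) == len(error_probs)):
        raise ValueError("one frequency, time, and probability per path")
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("path frequencies must sum to 1")
    failures = runs * sum(f * q for f, q in zip(freqs, error_probs))
    total_time = runs * sum(f * t for f, t in zip(freqs, times))
    return failures, total_time, failures / total_time

# Three hypothetical paths observed over 1000 runs.
failures, total_time, rate = path_model(
    freqs=[0.5, 0.3, 0.2],
    times=[1.0, 2.0, 4.0],
    error_probs=[0.01, 0.02, 0.05],
    runs=1000,
)
```

With these assumed values the sketch gives about 21 expected failures over about 1900 time units. Early in design the inputs could be crude historical estimates, to be replaced by measured values later.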

[Shoo77a]  Abstract:  The  paper  begins  by  describing  the  types  and  causes  of  software  errors  and  provides  work¬ 
ing  definitions  of  software  errors  and  software  reliability.  Some  of  the  basic  data  on  frequency  of  occurrence  of 
errors  is  then  discussed.  The  paper  then  summarizes  and  references  some  of  the  software  reliability  models 
which  have  been  proposed  and  concentrates  on  one  developed  by  the  author.  One  of  the  probabilistic  models, 
the  macro  model,  predicts  reliability  based  on  the  initial  number  of  errors  in  a  program,  the  number  removed, 
and  the  number  remaining  in  the  program.  The  model  constants  are  calculated  from  operational  test  data  taken 
on  the  software  performance.  The  other,  the  micro  model,  focuses  on  the  paths  in  the  program,  their  frequency 
and  time  of  traversal,  and  the  error  rate  along  these  paths. 

[Shoo79]  Abbreviated  Abstract:  This  interim  report  summarizes  the  research  performed  by  Polytechnic  Insti¬ 
tute  of  New  York  for  Rome  Air  Development  Center  under  contract  F30602-78-C-0057.  The  principal  topics 
covered are (1) software test models and implementation of automated test drivers to force-execute every program
path, (2) development of new measures of program complexity based upon information theory, (3) models
of  software  management  and  organizational  structure,  and  (4)  statistical  measures  relating  the  probability  of  find¬ 
ing  a  program  error  to  the  testing  of  that  program. 

Recursive  function  theory  was  applied  to  the  problem  of  program  complexity.  This  study  was  completed 
and a technical report was issued. The present report contains the abstract of the technical report.

The  inquiry  into  the  number  of  tests  necessary  to  verify  a  computer  program  was  undertaken.  One  phase 
of  this  study  was  completed,  and  a  technical  report  was  issued.  The  present  report  contains  the  abstract  of  the 
technical  report. 

A  study  was  undertaken  of  software  test  models  and  of  the  implementation  of  associated  test  drivers.  The 
present  report  describes  this  work  as  well  as  the  test  drivers  obtained  so  far. 

A  new  measure  of  complexity  based  upon  information  theory  is  introduced.  This  measure  assumes  that  a 
language  feature  used  infrequently  is  more  likely  to  be  used  incorrectly  than  a  language  feature  used  frequently. 
The measure has the advantage of being sensitive to the different levels of nesting in IFs, DOs, or
procedures.

A  number  of  different  schemes  are  suggested  for  the  calculation  of  the  measure.  A  method  for  automatic 
calculation  of  the  measure  at  an  installation  is  also  discussed. 

The  relation  between  program  complexity  and  the  program’s  information  content  was  also  investigated. 
The  results  obtained  so  far  are  described  in  this  report. 

Two  models  for  the  management  of  software  were  investigated.  The  first  one  models  the  productivity 
(measured in instructions per month), as well as the man-months required. The second model investigates different
communication schemes that can be evolved when a problem is partitioned into several subproblems.

The concluding section of the report describes the planned work in the next period and lists professional
activities of the personnel during the present reporting period.

[Sidh89] Abstract: A protocol standard, in general, can lead to different implementations, which necessitates
conformance testing of an implementation to its standard. Testing is carried out with the help of a test
sequence  generated  from  a  protocol  specification.  This  paper  presents  a  detailed  study  of  four  formal  methods 
(T-,  U-,  D-,  and  W-methods)  for  generating  test  sequences  for  protocols.  Applications  of  these  methods  to  NBS 
Class 4 Transport Protocol are discussed. This paper also presents an estimation of fault coverage of the four protocol
test sequence generation techniques using Monte Carlo simulation. The ability of a test sequence to decide
whether  a  protocol  implementation  conforms  to  its  specification  heavily  relies  upon  the  range  of  faults  that  it  can 
capture.  Conformance  is  defined  at  two  levels,  namely,  weak  and  strong  conformance.  This  study  shows  that  a 
test  sequence  produced  by  T-method  has  a  poor  fault  detection  capability  whereas  test  sequences  produced  by 
U-,  D-  and  W-methods  have  comparable  (superior  to  that  for  T-method)  fault  coverage  on  several  classes  of  ran¬ 
domly  generated  machines  used  in  this  study.  Also,  some  problems  with  a  straightforward  application  of  the  four 
protocol  test  sequence  generation  methods  to  real-world  communication  problems  are  pointed  out. 

[Sief88]  Abstract:  The  purpose  of  this  project  was  to  develop  a  tool  to  automate  the  method  for  evaluating 
software quality in Software Quality Evaluation Guidebook RADC-TR-85-37 Vol III (of three). The Automated
Measurement  System  (AMS),  a  computer-based  software  tool,  provides  the  capabilities  to  monitor  the  overall 
quality  and  resource  expenditure  of  software  under  development.  The  AMS  collects,  stores  and  analyzes 
software  measurement  data  for  use  by  software  acquisition  and  software  project  personnel.  It  provides  managers 
with  a  means  to  quantitatively  specify  goals  and  track  progress  toward  those  goals  during  all  phases  of  the 
software  life  cycle  (in  concert  with  DOD-STD-2167).  The  underlying  philosophy  of  the  AMS  is  based  on  a 
framework  consisting  of  a  set  of  13  software  factors  (i.e.,  reliability,  maintainability,  reusability,  portability, 
interoperability,  usability,  integrity,  flexibility,  expandability,  verifiability,  correctness,  survivability,  and  effi¬ 
ciency)  which  are  associated  with  high  level  concerns  of  software  quality. 

[Skil89] Abstract: We present a methodology for transforming a functional specification written in Lucid to an
equivalent  specification  that  captures  its  real-time  properties.  The  enhanced  specification  consists  of  a  set  of 
equations. These equations can be solved for several properties, including execution time and external requirements,
or they may simply be checked for the existence of a solution. Lucid has a set of meaning-preserving
transformations,  and  a  proof  system  corresponding  to  a  behavioral  semantics  has  been  constructed.  Both  of 
these  tools  can  be  used  to  reason  about  properties  of  the  specification. 

[Snee84]  Abstract:  The  data  processing  community  needs  to  apply  software  engineering  techniques  and  tools  to 
real  projects  to  determine  their  practical  usefulness.  Such  an  opportunity  was  provided  by  the  Bertelsmann  Pub¬ 
lishing  Corporation  of  Gutersloh,  West  Germany,  during  a  two  year  period  from  1981  to  1983.  This  article  reports 
the  results  of  that  project  and  the  experience  gained  from  it. 

[Snee85]  Abstract:  This  paper  describes  a  family  of  tools  which  not  only  supports  software  development,  but 
also  assures  the  quality  of  each  software  product  from  the  requirements  definition  to  the  integrated  system.  It  is 
based  upon  an  explicit  definition  of  the  design  objectives  and  includes  specification  verification,  design  evalua¬ 
tion,  static  program  analysis,  dynamic  program  analysis,  integration  test  auditing,  and  configuration  manage¬ 
ment. 

[Snee86] Abstract: The following paper presents a metric for measuring test coverage which will enhance the
present  test  metrics.  This  measurement  focuses  on  the  data  used  by  the  program  under  test.  By  dynamically  mon¬ 
itoring  the  change  of  data  states  at  test  time  it  is  possible  to  record  how  data  are  actually  used.  By  statically 
analyzing  the  operands  of  a  program  it  is  possible  to  record  how  they  are  referenced  by  the  program.  By  analyz¬ 
ing  the  specification  it  is  possible  to  derive  how  the  data  should  be  used.  Finally,  the  specified  use  is  compared 
with  the  programmed  use  and  the  programmed  use  with  the  tested  use,  in  order  to  determine  to  what  degree  all 
specified  data  usages  have  been  tested.  The  ratio  of  actual  tested  usage  to  the  specified  usage  gives  the  total  data 
coverage. 
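
The comparison described above can be sketched with sets of (data item, usage) pairs. This is a minimal illustration under our own assumptions, not the author's tool: the three input sets are taken as already extracted from the specification, static analysis, and test monitoring, and the name data_coverage and the sample data are hypothetical.

```python
def data_coverage(specified, programmed, tested):
    """Compare specified, programmed, and tested data usages.

    Each argument is an iterable of (data_item, usage) pairs.
    Coverage is the share of specified usages that were actually tested.
    """
    specified, programmed, tested = set(specified), set(programmed), set(tested)
    return {
        # specified usages that the program never references
        "unimplemented": specified - programmed,
        # specified and programmed usages that no test exercised
        "untested": (specified & programmed) - tested,
        "coverage": len(specified & tested) / len(specified) if specified else 1.0,
    }

# Hypothetical usages for a small account-handling program.
specified = {("balance", "read"), ("balance", "write"), ("rate", "read"), ("limit", "read")}
programmed = {("balance", "read"), ("balance", "write"), ("rate", "read")}
tested = {("balance", "read"), ("rate", "read")}
result = data_coverage(specified, programmed, tested)
```

In this example two of the four specified usages were tested, so the coverage is 0.5; one specified usage is missing from the program and one programmed usage was never exercised.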

[Solo84]  Abstract:  We  suggest  that  expert  programmers  have  and  use  two  types  of  programming  knowledge:  1) 
programming  plans,  which  are  generic  program  fragments  that  represent  stereotypic  action  sequences  in  pro¬ 
gramming,  and  2)  rules  of  programming  discourse,  which  capture  the  conventions  in  programming  and  govern  the 
composition  of  the  plans  into  programs.  We  report  here  on  two  empirical  studies  that  attempt  to  evaluate  the 
above hypothesis. Results from these studies do in fact support our claim.

[Sonc80]  Abstract:  A  finite  state,  continuous  time  Markov  model  is  presented,  which  provides  reliability  meas¬ 
ures  (i.e.,  expected  down  time)  for  duplicated  and  repairable  fault-tolerant  computing  systems  whose  main 
penalty  depends  on  the  total  duration  of  failures  over  a  given  time  period.  Most  of  the  existing  models  estimate 
reliability  measures  (e.g.,  mean  time  before  failure)  derived  from  the  reliability  function,  meaningful  only  for  sys¬ 
tems  whose  main  penalty  depends  on  the  frequency  of  failures.  In  addition,  the  model  described  here  removes 
the simplifying assumption, made by some of the previous models, that the system is made of independent subsystems
and that each subsystem can be modeled separately. Recognizing the fact that certain faults may affect
more than one subsystem, this model represents the entire system, assuming however a small number of duplicated
subsystems. The model has been implemented as a general interactive program to provide speedy estimation
of reliability measures in the evaluation of fault-tolerant computer architecture designs. An example is
included  to  illustrate  the  capability  of  the  model. 

[Sonc81] Abstract:

[Soon77]  Summary:  This  paper  contributes  to  the  understanding  of  program  structures  in  terms  of  its  stability 
and  reliability  in  a  quantitative  sense.  Distinctions  are  made  between  the  logical  structure  of  a  program  and  the 
information  structure  of  a  program. 

The  general  characteristics  of  a  good  program  will  not  be  discussed  in  this  paper  other  than  citing  relevant 
references. The term stability is defined as the resistance to the amplification of changes that have been made to a
given  program.  The  information  structure  of  a  program  is  based  on  the  sharing  of  information  between  the 
components of the program.

Some quantitative analysis is derived to measure the quality of a program in terms of its information structure.
The techniques used here are the method of connectivity matrix and that of random Markovian process. A
high  level  quantitative  measure  of  the  information  structure  will  be  presented  together  with  an  informal  proof  for 
the  uniqueness  and  the  existence  of  this  measure. 

Applying  this  technique,  several  simple  program  information  structures  are  measured  for  their  stability 
characteristics.  The  simplicity  of  the  structure  is  chosen  deliberately  so  that  this  technique  can  be  checked 
against  intuitive  preference  of  stability.  Another  example  with  a  structure  of  some  sophistication  is  presented  to 
compare  a  tree  structure  to  a  pair  of  cooperating  sequential  processes. 

[Sork79]  Abstract:  Here  is  a  presently  operational  plan  to  improve  the  quality  of  program  testing.  After  all  pro¬ 
grams  are  tested  alone,  an  independent  quality  control  staff  uses  automated  tools  to  certify  that  minimum  testing 
criteria  have  been  met. 

[Srin85]  Introduction:  The  dataflow  model  of  computation  allows  functions  to  be  run  concurrently  on  multiple 
processors,  reducing  execution  time  significantly.  This  advantage  and  the  partial  results  of  the  computation  will 
be  lost  if  processors  fail.  Therefore,  a  crucial  feature  in  a  concurrent  system  is  the  ability  to  continue  the  compu¬ 
tation  when  components  fail,  a  feature  known  as  fault  tolerance. 

Although  several  dataflow  architectures  have  been  proposed,  few  are  fault  tolerant  and  able  to  balance  the 
load  on  the  system  dynamically.  But  a  distributed  computer  system  (DCS)  based  on  a  task-level  dataflow  archi¬ 
tecture  can  reduce  traffic,  speed  communication  between  processors,  and  tolerate  hardware  faults  by  automati¬ 
cally  reassigning  computations  to  a  healthy  processor.  Such  a  DCS  has  the  potential  to  provide  better  perfor¬ 
mance  than  conventional  multiprocessors  because  the  execution  of  a  function  is  free  of  side  effects. 

By  asking  when  and  how  to  do  node  reassignment  as  the  dataflow  architecture  and  processor  are  designed, 
designers  can  incorporate  the  necessary  support  mechanisms.  This  article  considers  dataflow  graphs  with  nodes 
representing  asynchronous  tasks. 

[Stan83]  Abstract:  Arcturus  offers  an  approach  to  the  integrated  use  of  compiled  and  interpreted  Ada,
template-driven  Ada  text  editing,  an  Ada  Program  Design  Language  (PDL),  Ada  program  performance
measurement  with  color  profiles,  formatted  printing  of  Ada  programs  with  useful  listing  options,  and  automated  stepwise
refinement  from  Ada  program  designs  written  in  Ada  PDL  into  executable  Ada.  This  paper  has  two  objectives: 
(a)  to  provide  a  two  page  thumbnail  sketch  of  some  interactive  Ada  capabilities  in  Arcturus,  and  (b)  to  provide  a 
detailed  scenario  of  interactive  Ada  programming  at  the  very  simplest  level,  using  Ada-oriented  variants  of 
interactive  programming  techniques  that  have  proven  effective  in  practice  —  the  hope  being  that  the  reader  will 
be  convinced  that  interactive  Ada  is  an  idea  worth  vigorous  further  pursuit. 

[Stan84a]  Abstract:  The  Arcturus  system  demonstrates  several  important  principles  that  will  characterize 
advanced  Ada  programming  support  environments.  These  include  conceptual  simplicity,  tight  coupling  of  tools, 
and  effective  command  and  editing  concepts.  Arcturus  supports  interactive  program  development  and  permits 
the  combined  use  of  interpretive  and  compiled  execution.  Arcturus  is  not  complete,  however,  as  practical,
mature  environments  for  Ada  must  also  support  the  development,  analysis,  testing,  and  debugging  of  concurrent
programs.  These  issues  are  currently  being  explored.  Arcturus,  therefore,  is  a  platform  for  experimental
exploration  of  key  programming  environment  issues.  This  paper  focuses  primarily  on  the  current  system, 
describing  and  illustrating  some  of  its  components,  while  issues  less  fully  developed  are  more  briefly  described. 

[StetM]  Abstract:  In  a  recent  paper,  an  approximate  formula  for  the  number  of  faults  per  line  of  code  was 
developed.  We  show  that  there  is  an  approximation  which  is  easier  to  develop,  more  accurate,  and  simpler  to 
use. 


[Stev74]  Abstract:  Considerations  and  techniques  are  proposed  that  reduce  the  complexity  of  programs  by
dividing  them  into  functional  modules.  This  can  make  it  possible  to  create  complex  systems  from  simple,
independent,  reusable  modules.  Debugging  and  modifying  programs,  reconfiguring  I/O  devices,  and  managing
large  programming  projects  can  all  be  greatly  simplified.  And,  as  the  module  library  grows,  increasingly  sophisti¬ 
cated  programs  can  be  implemented  using  less  and  less  new  code. 

[Stlg74]  Abbreviated  Introduction:  The  complexity  of  digital  computers  and  their  large  scale  use  have  led  some 
researchers  to  investigate  tools  not  commonly  used.  In  recent  years,  applications  of  graph  theory  to  computers 
as  well  as  other  fields  of  study  have  given  fruitful  results  and  have  attracted  more  and  more  scientists.  The 
attempt  here  will  be  to  review  previous  accomplishments  on  a  fundamental  level  and  to  stimulate  the  reader  to 
investigate  an  area  where  valuable  work  is  being  performed. 

[Stuc72]  Abbreviated  Introduction:  The  measurement  process  plays  a  vital  role  in  the  quality  assurance  and 
testing  of  new  hardware  systems.  To  insure  the  reliability  of  the  final  hardware  system,  each  stage  of  development 
incorporates  performance  standards  and  testing  procedures.  The  establishment  of  software  performance  criteria 
has  been  very  nebulous.  At  first  the  desire  to  “just  get  it  working”  prevailed  in  most  software  development 
efforts.  With  the  increasing  complexity  of  new  and  evolving  software  systems,  improved  measurement  tech¬ 
niques  are  needed  to  facilitate  disciplined  program  testing  beyond  merely  debugging.  The  Program  Testing  Trans¬ 
lator  is  an  automatic  tool  designed  to  aid  in  the  measurement  and  testing  of  software  systems. 

Early  attempts  at  the  application  of  measurement  techniques  to  software  dealt  mainly  with  efforts  to  meas¬ 
ure  the  hardware  utilization  characteristics.  In  an  attempt  to  further  improve  hardware  utilization,  several  aids 
have  been  developed  ranging  from  optimized  compilers  to  automated  execution  monitoring  systems.  The  Pro¬ 
gram  Testing  Translator,  designed  to  aid  in  the  testing  of  programs,  goes  further.  In  addition  to  providing  execu¬ 
tion  time  statistics  on  the  frequency  of  execution  for  various  program  statements,  the  Program  Testing  Translator 
performs  a  “standards”  check  to  insure  programmers’  compliance  to  an  established  coding  standard,  gathers 
data  on  the  extent  to  which  various  branches  of  a  program  are  executed,  and  provides  data  range  values  on 
assignment  statements  and  DO-loop  control  variables. 

[Stuc75a]  Abstract:  Automated  tools  and  structured  programming  techniques  are  in  use  on  a  variety  of  scientific
and  business  application  programming  projects  within  the  McDonnell  Douglas  Corp.  An  examination  of  the 
resulting  programs  reveals  certain  development  and  maintenance  characteristics  that  suggest  new  and  very 
interesting  applications  for  automated  tools. 

An  extension  of  PET  (a  currently  operational  McDonnell  Douglas  validation  tool  for  FORTRAN)  to 
include  a  user  embedded  assertion  capability  offers  a  step  in  the  direction  of  automatically  verifying  the  dynamic 
execution  of  programs.  A  user-oriented  local  and  global  assertion  capability  is  introduced  and  its  implementa¬ 
tion  is  discussed. 

Application  of  these  tools  within  a  well-conceived  structured  programming  environment  offers  a  positive 
step  forward  in  the  development  of  more  reliable  software  systems. 

[Stuc77]  Abbreviated  Background:  Another  set  of  tools  has  also  been  introduced  over  the  last  few  years  to  deal 
with  control  flow  through  programs.  Software  probes  or  instrumentation  are  automatically  placed  into  a  program 
for  monitoring  the  dynamic  execution  behavior  of  an  algorithm.  Software  probes  in  the  form  of  source  language 
statements  are  inserted  into  the  source  code  to  gather  statistics  during  program  execution.  These  probes  can  pro¬ 
vide  insight  into  many  aspects  of  algorithmic  behavior  beyond  a  simple  flow  of  control  analysis.  The  notion  of 
building  self-metric  software  has  been  introduced  previously  by  the  author;  however,  significant  expansion  of  this 
concept  is  now  being  explored  as  a  vehicle  for  improving  software  quality. 

In  order  to  illustrate  the  type  of  automated  tool  capabilities  currently  available  and  some  of  the  new  tech¬ 
niques  now  under  construction,  the  tool  most  familiar  to  the  author  will  be  described.  It  is  hoped  that  this 
currently  operational  system  will  offer  some  insight  into  the  concept  of  self-metric  software  and  show  a  few  of  the 
measurement  schemes  available  for  dynamic  program  analysis. 
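
The source-language probes described in this entry can be illustrated with a minimal Python sketch. The `probe` helper, the `probe_counts` table, and the `count_negatives` routine are hypothetical names chosen for illustration; they are not taken from the cited paper or from the tool it describes, which places such statements automatically rather than by hand.

```python
from collections import Counter

# Execution counts gathered by the probes (hypothetical name).
probe_counts = Counter()


def probe(label):
    """Record one execution of the program point called `label`."""
    probe_counts[label] += 1


def count_negatives(values):
    """Example routine with probes placed at the loop body and on each branch."""
    negatives = 0
    for value in values:
        probe("loop body")
        if value < 0:
            probe("branch: negative")
            negatives += 1
        else:
            probe("branch: non-negative")
    return negatives
```

After a test run, `probe_counts` shows how often each statement group and each branch executed; a label with a count of zero marks code that the test data did not reach.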

[Suke77a]  Abstract:  Reports  on  the  initial  phase  of  a  software  reliability  modeling  study,  in  which  nine  software 
reliability  models  are  applied  against  software  error  data  detailing  the  complete  error  history  from  the  start  of 
formal  testing  through  delivery  of  a  large  command  and  control  software  development  project  with  over  100,000
lines  of  Jovial  code.  The  paper  describes  the  models  considered  and  the  procedures  used  to  prepare  the  data  for 
model  input.  Model  predictions  are  then  compared  and  analyzed  against  the  actual  post-delivery  error  data  for 
this  project.  From  this  analysis,  conclusions  concerning  model  applicability  and  some  possible  extensions  of  this 
study  are  discussed. 

[Suke79]  Abstract:  From  1974  Aug  to  1978  May  a  study  to  validate  several  mathematical  models  for  predicting 
the  reliability  and  error  content  of  a  software  package  against  error  data  extracted  from  four  large  U.S.A. 
Department  of  Defense  software  development  projects  was  undertaken  by  Rome  Air  Development  Center.  This 
paper  will  describe  the  results  of  this  empirical  study  for  three  such  models:  Jelinski-Moranda,
Schick-Wolverton,  modified  Schick-Wolverton.  Model  predictions  will  be  compared  on  a  total  project,  functional,  and  error
severity  basis,  and  on  a  daily  vs.  weekly  basis  for  defining  model  time  intervals.  The  question  of  when  to  begin 
applying  these  models  will  be  addressed.  General  conclusions  are  drawn  as  to  model  applicability. 
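
For orientation, the first of the three models named in this entry can be sketched in a few lines of Python. The sketch assumes the usual textbook formulation of the Jelinski-Moranda model: the program starts with a fixed number of faults, each remaining fault contributes the same constant rate `phi`, and every failure removes exactly one fault. The abstract does not give the equations, and the function and parameter names are illustrative only.

```python
def jm_hazard_rate(total_faults, phi, failures_so_far):
    """Failure rate once `failures_so_far` faults have been found and removed.

    Assumes every remaining fault contributes the same constant rate `phi`.
    """
    remaining = total_faults - failures_so_far
    if remaining < 0:
        raise ValueError("more failures observed than faults assumed")
    return phi * remaining


def jm_expected_time_to_next_failure(total_faults, phi, failures_so_far):
    """Mean of the exponential inter-failure time; infinite when no faults remain."""
    rate = jm_hazard_rate(total_faults, phi, failures_so_far)
    return float("inf") if rate == 0 else 1.0 / rate
```

Under these assumptions the hazard rate drops by `phi` after each repair, so the expected time between failures grows as testing proceeds.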

[Suno82]  Abstract:  The  program  complexity  measure  currently  seems  to  be  the  most  capable  measure  for  both
quantitative  and  objective  control  of  the  software  project.  Five  program  complexity  measures  (step  count, 
McCabe’s  V(G),  Halstead’s  E,  Weighted  Statement  Count  and  Process  V(G))  were  assessed  from  such  a 
viewpoint.  This  empirical  study  was  done  with  the  data  collected  through  a  practical  software  project.  All  of 
these  measures  have  highly  significant  correlations  with  the  management  data.  Application  of  complexity  meas¬ 
ures  to  software  development  management  is  discussed  and  a  method  for  the  detection  of  anomalous  modules  in 
a  program  is  proposed. 
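
Of the five measures listed in this entry, McCabe's V(G) is the simplest to state: for a control-flow graph with E edges, N nodes, and P connected components, V(G) = E - N + 2P. The Python sketch below illustrates that formula only. The edge-list representation and the function name are assumptions made for the example, and it does not reproduce the tooling used in the study.

```python
def cyclomatic_complexity(edges, num_components=1):
    """Compute V(G) = E - N + 2P for a control-flow graph.

    `edges` is a list of (source, target) pairs. Nodes are inferred from
    the edges, so isolated nodes are not counted in this sketch.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components


# Control-flow graph of a single if-then-else statement.
if_then_else = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
```

For the `if_then_else` graph the function returns 2, matching the two independent paths through the statement.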

[Suns77]  Abstract:  It  is  becoming  increasingly  important  that  communication  protocols  be  formally  specified 
and  verified.  This  paper  describes  a  particular  approach  -  the  state  transition  model  -  using  a  collection  of 
mechanically  supported  specification  and  verification  tools  incorporated  in  a  running  system  called  AFFIRM. 
Although  developed  for  the  specification  of  abstract  data  types  and  the  verification  of  their  properties,  the  for¬ 
malism  embodied  in  AFFIRM  can  also  express  the  concepts  underlying  state  transition  machines.  Such  models 
easily  express  most  of  the  events  occurring  in  protocol  systems  including  those  of  the  users,  their  agent 
processes,  and  the  communication  channels.  The  paper  reviews  the  basic  concepts  of  state  transition  models, 
and  the  AFFIRM  formalism  and  methodology  and  describes  their  union.  A  detailed  example,  the  alternating  bit 
protocol,  illustrates  various  properties  of  interest  for  specification  and  verification.  Other  examples  explored 
using  this  formalism  are  briefly  described  and  the  accumulated  experience  is  discussed. 

[Symo88]  Abstract:  The  method  of  Function  Point  Analysis  was  developed  by  Allan  Albrecht  to  help  measure 
the  size  of  a  computerized  business  information  system.  Such  sizes  are  needed  as  a  component  of  the  measure¬ 
ment  of  productivity  in  system  development  and  maintenance  activities,  and  as  a  component  of  estimating  the 
effort  needed  for  such  activities.  Close  examination  of  the  method  shows  certain  weaknesses,  and  the  author 
proposes  a  partial  alternative.  The  paper  describes  the  principles  of  this  “Mark  II”  approach,  the  results  of  some 
measurements  of  actual  systems  to  calibrate  the  Mark  II  approach,  and  conclusions  on  the  validity  and
applicability  of  function  point  analysis  generally.

[Szul84]  Introduction:  The  manner  in  which  software  for  DoD  applications  is  developed  is  undergoing  evolu¬ 
tionary  change  with  the  introduction  of  Ada  and  its  support  tools.  This  change  has  been  prompted  by  the  desire 
to  increase  software  quality  and  developer  productivity.  Although  design-aid  tools,  and  techniques  for  measuring 
software  quality  have  been  of  interest  to  the  research  community  for  some  time  now,  the  user  community  has 
only  recently  expressed  a  need  for  this  technology  as  evidenced  by  the  Software  Technology  for  Adaptable  and 
Reliable  Systems  (STARS)  program.  An  important  part  of  the  STARS  program  is  the  development  of  metrics  to 
measure  the  quality  of  both  the  software  development  process  and  software  products.  Even  though  the  STARS 
focus  is  not  Ada,  tools  and  techniques  developed  through  this  effort  will  likely  become  part  of  the  Ada  Program¬ 
ming  Support  Environments  (APSEs). 

This  paper  reports  on  work  done  in  investigating  the  use  of  Ada  as  a  Program  Design  Language  (PDL), 
and  the  evaluation  of  Ada  designs  with  a  design  metric.  The  first  section  provides  background  and  describes  the
context  for  the  work.  The  second  section  defines  the  Halstead  metrics  and  discusses  their  application  during  the 
design  phase.  The  third  section  discusses  using  Ada  as  a  Program  Design  Language.  The  fourth  section  presents 
an  example  which  illustrates  the  usefulness  of  the  design  metrics  on  the  Ada  PDL  design  medium.  Finally,  the 
conclusions  of  this  work  are  presented. 
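
The Halstead metrics referred to in this entry are computed from four counts: distinct operators n1, distinct operands n2, total operator occurrences N1, and total operand occurrences N2. A minimal Python sketch of the standard definitions follows. How operators and operands are identified in an Ada PDL design is the subject of the paper and is not assumed here; the function name and the returned dictionary keys are illustrative.

```python
import math


def halstead_metrics(n1, n2, N1, N2):
    """Return the basic Halstead measures from operator and operand counts."""
    if n2 <= 0 or n1 < 0:
        raise ValueError("need at least one distinct operand")
    vocabulary = n1 + n2                     # n = n1 + n2
    length = N1 + N2                         # N = N1 + N2
    volume = length * math.log2(vocabulary)  # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)        # D = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume             # E = D * V
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }
```

Because only counts are needed, the same calculation can be applied to a design text as well as to finished code, provided the counting rules are fixed in advance.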

[Tai80]  Abstract:  This  paper  explores  the  testing  complexity  of  several  classes  of  programs,  where  the  testing 
complexity  is  measured  in  terms  of  the  number  of  test  data  required  for  demonstrating  program  correctness  by 
testing.  It  is  shown  that  even  for  very  restrictive  classes  of  programs,  none  of  the  commonly  used  criteria,  namely 
having  every  statement,  branch,  and  path  executed  at  least  once,  is  nearly  sufficient  to  guarantee  absence  of 
errors. 

Based  on  the  study  of  testing  complexity,  this  paper  proposes  two  new  test  criteria,  one  for  testing  a  path 
and  the  other  for  testing  a  program.  These  new  criteria  suggest  how  to  select  test  data  to  obtain  confidence  in
program  correctness  beyond  the  requirement  of  having  each  statement,  branch,  or  path  tested  at  least  once. 

[Tai85a]  Abstract:  In  this  paper  a  type  of  error  in  concurrent  software,  called  synchronization  error,  is  defined. 
How  to  analyze  a  concurrent  specification  or  design  in  order  to  detect  synchronization  errors  is  discussed. 

[Tai85c]  Abstract:  Repeated  executions  of  a  concurrent  program  with  the  same  input  may  exercise  different 
paths  in  this  program,  thus  making  concurrent  programs  more  difficult  to  test  than  sequential  programs.  This 
paper  addresses  several  fundamental  issues  on  the  testing  of  concurrent  programs.  A  type  of  error  in  concurrent 
programs,  called  synchronization  error,  is  formally  defined.  To  detect  such  errors,  a  new  form  of  test  case  is  pro¬ 
posed,  which  consists  of  an  input  and  a  synchronization  sequence  and  is  called  an  IN_SYN  test  case.  How  to 
generate  IN_SYN  test  cases  for  a  concurrent  program  is  discussed.  In  order  to  execute  an  IN_SYN  test  case,
the  problem  of  reproducing  a  sequence  of  synchronizations  between  concurrent  processes  arises,  which  is 
referred  to  as  the  reproducible  testing  problem.  Four  basic  approaches  to  solving  this  problem  are  presented. 
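
The idea of a test case that pairs an input with a synchronization sequence can be illustrated with a small Python sketch. It forces threads to take their steps in a prescribed order, which is one simple way to make a concurrent run repeatable; it is not claimed to be any of the four approaches presented in the paper. The function name, the dictionary of task steps, and the use of a condition variable are all assumptions made for the example.

```python
import threading


def run_in_syn_test(steps_by_task, sync_sequence):
    """Run task steps in exactly the order named by `sync_sequence`.

    `steps_by_task` maps a task name to a list of zero-argument callables.
    Each occurrence of a name in `sync_sequence` lets that task run its
    next step. Returns (task, result) pairs in execution order.
    """
    unknown = set(sync_sequence) - set(steps_by_task)
    if unknown:
        raise ValueError(f"unknown tasks in sequence: {sorted(unknown)}")
    for name, steps in steps_by_task.items():
        if sync_sequence.count(name) != len(steps):
            raise ValueError(f"sequence does not match the steps of {name!r}")

    turn = threading.Condition()
    position = 0
    trace = []

    def worker(name, steps):
        nonlocal position
        for step in steps:
            with turn:
                # Block until the sequence says it is this task's turn.
                turn.wait_for(lambda: sync_sequence[position] == name)
                trace.append((name, step()))
                position += 1
                turn.notify_all()

    threads = [
        threading.Thread(target=worker, args=item)
        for item in steps_by_task.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return trace
```

Calling the function repeatedly with the same steps and the same sequence yields the same trace every time, whereas unsynchronized threads could interleave differently from run to run.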

[Tai86]  Abstract:  Repeated  executions  of  a  concurrent  Ada  program  with  the  same  input  may  exercise  different
sequences  of  rendezvous,  thus  making  concurrent  Ada  programs  more  difficult  to  test  than  sequential  Ada  pro¬ 
grams.  The  reproducible  testing  problem  for  Ada  is  how  to  reproduce  a  sequence  of  rendezvous  of  a  concurrent 
Ada  program.  This  problem  exists  not  only  for  debugging  concurrent  Ada  programs,  but  also  for  determining 
the  correctness  of  concurrent  Ada  programs. 

In  this  paper,  we  present  a  solution  to  the  reproducible  testing  problem  for  an  arbitrary  concurrent  Ada 
program.  This  solution  transforms  a  concurrent  Ada  program  P  into  P’  such  that  the  reproduction  of  a  sequence 
of  rendezvous,  say  S,  of  P  with  input  X  requires  exactly  one  execution  of  P’  with  (X,S)  as  input.  The  proposed 
solution  can  be  easily  automated. 

[Taka89]  Abstract:  Accuracy  in  program  error  prediction  is  a  major  problem  in  quality  control  of  a  large-scale 
software  system.  This  paper  presents  a  model  to  estimate  the  number  of  errors  remaining  in  a  program  at  the
beginning  of  the  testing  phase  of  development.  In  the  first  part  of  the  study,  the  relationships  between  the  errors 
occurring  in  a  program  and  the  various  factors  which  have  an  effect  on  software  development,  such  as  program¬ 
mer’s  skill,  are  statistically  analyzed.  The  model  is  then  derived  by  using  the  factors  significantly  identified  in  the 
analysis.  This  empirical  study  is  based  on  data  collected  during  the  development  of  large-scale  software  systems. 
Results  of  the  study  indicate  that  factors  such  as  frequency  of  program  specification  change,  programmer’s  skill, 
and  volume  of  program  design  documentation  are  significant  and  that  the  model  based  on  these  factors  is  more 
reliable  than  conventional  error  prediction  methods  based  on  program  size  alone. 

[Taus88]  Abstract:  This  article  contains  the  results  of  initial  research  work  performed  to  extend  the  applicability 
of  McCabe’s  Cyclomatic  Complexity  Metric  for  the  analysis  of  Ada  software.  Having  proved  useful  both  as  a  log¬ 
ical  measurement  technique  and  as  a  testing  aid,  the  Ada  Complexity  Extension  (ACE)  is  proposed  for  general 
acceptance  as  a  standard  to  provide  a  useful  metric  that  may  assist  in  improving  the  quality  of  Ada  software 
programs.

[Tayl78b]  Abstract:  This  paper  describes  the  overall  design  of  some  modular  capabilities  for  error  detection
testing,  verification,  and  documentation  of  concurrent  process  HAL/S  programs.  The  work  described  draws
upon  many  ideas  first  advanced  in  building  tools  for  single  process  software.  In  this  paper,  these  ideas  are  signifi¬ 
cantly  extended  and  adapted  to  realize  the  power  of  these  tools  for  concurrent  software.  Particular  attention  is 
paid  to  the  design  of  static  data  flow  analysis  capabilities  for  concurrent  software. 

[Tayl80a]  Abstract:  The  increasing  cost  of  computer  system  failure  has  stimulated  interest  in  improving  software
reliability.  One  way  to  do  this  is  by  adding  redundant  structural  data  to  data  structures.  Such  redundancy  can  be 
used  to  detect  and  correct  (structural)  errors  in  instances  of  a  data  structure.  The  intuitive  approach  of  this 
paper,  which  makes  heavy  use  of  examples,  is  complemented  by  the  more  formal  development  of  the  companion 
paper,  “Redundancy  in  Data  Structures:  Some  Theoretical  Results.” 
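
A small Python sketch can make the idea of redundant structural data concrete. In the doubly linked list below, the back pointers and the stored node count carry no information beyond what the forward chain already holds, which is what allows damage to be detected and, in simple cases, corrected. The class and method names are hypothetical, and the example is not drawn from the cited paper.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None  # redundant: derivable from the forward chain


class CheckedList:
    """Doubly linked list that also stores its node count."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0  # redundant: derivable by walking the list

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
            node.prev = self.tail
        self.tail = node
        self.count += 1
        return node

    def find_structural_errors(self):
        """Compare the redundant data against the forward chain."""
        errors = []
        seen = 0
        previous = None
        node = self.head
        # Stop after count + 1 nodes so a corrupted cycle cannot loop forever.
        while node is not None and seen <= self.count:
            if node.prev is not previous:
                errors.append(f"bad back pointer at node {seen}")
            previous = node
            node = node.next
            seen += 1
        if seen != self.count:
            errors.append(f"stored count {self.count} disagrees with forward chain")
        return errors

    def repair_back_pointers(self):
        """Rebuild every back pointer from the forward chain, assumed intact."""
        previous = None
        node = self.head
        while node is not None:
            node.prev = previous
            previous = node
            node = node.next
        self.tail = previous
```

A single damaged back pointer is reported by `find_structural_errors` and restored by `repair_back_pointers`; damage to the forward chain itself would need the opposite repair, driven by the back pointers.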

[Tayl80b]  Abstract:  Algorithms  are  presented  for  detecting  errors  and  other  anomalies  in  programs  which  use
synchronization  constructs  to  implement  concurrency.  The  algorithms  employ  data  flow  analysis  techniques. 
First  used  in  compiler  object  code  optimization,  the  techniques  have  more  recently  been  used  in  the  detection  of 
variable  usage  errors  in  single  process  programs.  By  adapting  these  existing  algorithms,  the  same  class  of  variable 
usage  errors  can  be  detected  in  concurrent  process  programs.  Important  classes  of  errors  unique  to  concurrent 
process  programs  are  also  described,  and  algorithms  for  their  detection  are  presented. 

[Tayl82a]  Abstract:  This  paper  sets  some  context,  raises  issues,  and  provides  [the  authors’]  initial  thinking  on  the
characteristics  of  effective  rapid  prototyping  techniques. 

After  discussing  the  role  rapid  prototyping  techniques  can  play  in  the  software  lifecycle,  the  paper  looks 
at  possible  technical  approaches  including:  heavily  parameterized  models,  reusable  software,  rapid  prototyping 
languages,  prefabrication  techniques  for  system  generation,  and  reconfigurable  test  harnesses. 

The  paper  concludes  that  a  multi-faceted  approach  to  rapid  prototyping  techniques  is  needed  if  we  are  to 
address  a  broad  range  of  applications  successfully  —  no  single  technical  approach  suffices  for  all  potentially 
desirable  applications. 

[Tayl82b]  Abstract:  A  common  paradigm  for  the  development  of  process-control  or  embedded  computer 
software  is  to  do  most  of  the  implementation  and  testing  on  a  large  host  computer,  then  retarget  the  code  for  final 
checkout  and  production  execution  on  the  target  machine.  The  host  machine  is  usually  large  and  provides  a 
variety  of  program  development  tools,  while  the  target  may  be  a  small,  bare  machine.  A  difficulty  with  the  para¬ 
digm  arises  when  the  software  developed  has  real-time  constraints  and  is  composed  of  multiple  communicating 
processes.  If  a  test  execution  on  the  target  fails,  it  may  be  exceptionally  tedious  to  determine  the  cause  of  the 
failure.  Host  machine  debuggers  cannot  normally  be  applied,  because  the  same  program  processing  the  same 
data  will  frequently  exhibit  different  behavior  on  the  host.  Differences  in  processor  speed,  scheduling  algorithm, 
and  the  like,  account  for  the  disparity.  This  paper  proposes  a  partial  solution  to  this  problem,  in  which  the  errant 
execution  is  reconstructed  and  made  amenable  to  source  language  level  debugging  on  the  host.  The  solution 
involves  the  integrated  application  of  a  static  concurrency  analyzer,  an  interactive  interpreter,  and  a  graphic  pro¬ 
gram  visualization  aid.  Though  generally  applicable,  the  solution  is  described  here  in  the  context  of  multi-task 
real-time  Ada  programs. 

[Tayl83a]  Abstract:  Developing  and  verifying  concurrent  programs  presents  several  problems.  A  static  analysis 
algorithm  is  presented  here  that  addresses  the  following  problems:  how  processes  are  synchronized,  what  deter¬ 
mines  when  programs  are  run  in  parallel,  and  how  errors  are  detected  in  the  synchronization  structure.  Though 
the  research  focuses  on  Ada,  the  results  can  be  applied  to  other  concurrent  programming  languages  such  as 
CSP. 

[Tayl83b]  Summary:  Foundational  to  verification  of  some  aspects  of  communicating  concurrent  systems  is
knowledge  of  the  synchronization  which  may  occur  during  execution.  The  synchronization  determines  the
actions  that  may  occur  in  parallel,  may  determine  program  data  flow,  and  may  also  lead  to  inherently  erroneous 
situations  (e.g.,  deadlock).  This  paper  formalizes  the  notion  of  the  synchronization  structure  of  concurrent  pro¬ 
grams  that  use  the  rendezvous  (or  similar  mechanism  for  achieving  synchronization).  The  formalism  is  oriented 
towards  supporting  verification  as  performed  by  automated  static  program  analysis.  Complexity  results  are 
presented  which  indicate  what  may  be  expected  in  this  area  and  which  also  shed  light  on  the  difficulty  of  correctly 
constructing  concurrent  systems.  Specifically,  most  of  the  analysis  tasks  considered  are  shown  to  be  intractable. 

[Tayl83c]  Abstract:  Stand-alone  techniques  for  the  analysis  and  testing  of  the  synchronization  structure  of  con¬ 
current  programs  have  recently  been  developed.  These  techniques  are  able  to  detect,  for  example,  task  block¬ 
age,  including  deadlock.  Static  analysis  provides  firm  results,  but  has  limited  applicability  and  is  potentially 
expensive.  Dynamic  analysis  makes  fewer  assumptions,  but  its  assurances  are  not  as  strong.  This  paper  presents 
strategies  whereby  the  two  can  be  employed  jointly  to  advantage.  Dynamic  analysis  can  be  used  to  further  investi¬ 
gate  results  from  static  analysis,  and  vice  versa.  Their  joint  use  can  be  facilitated  by  an  appropriate  implementa¬ 
tion,  some  principles  for  which  are  outlined. 

[Tayl85]  Abstract:  Conceptual  simplicity,  tight  coupling  of  tools,  and  effective  support  of  host-target  software 
development  will  characterize  advanced  Ada  programming  support  environments.  Several  important  principles 
have  been  demonstrated  in  the  Arcturus  system,  including  template-assisted  Ada  editing,  command  completion 
using  Ada  as  a  command  language,  and  combining  the  advantages  of  interpretation  and  compilation.  Other  prin¬ 
ciples,  relating  to  analysis,  testing,  and  debugging  of  concurrent  Ada  programs,  have  appeared  in  other  contexts. 
This  paper  discusses  several  of  these  topics,  considers  how  they  can  be  integrated,  and  argues  for  their  inclusion 
in  an  environment  appropriate  for  software  development  in  the  late  1980’s. 

[Tayl86a]  Abstract:  Though  structural  testing  techniques  are  among  the  weakest  available  with  regard  to 
developing  confidence  in  sequential  programs,  they  are  not  without  merit.  This  paper  extends  the  notion  of  struc¬ 
tural  testing  criteria  to  concurrent  programs  and  proposes  tools  supporting  structural  testing  techniques. 
Requisite  support  tools  include  a  static  concurrency  analyzer  and  either  a  program  transformation  system  or  a 
powerful  run-time  monitor.  Also  helpful  is  a  controllable  run-time  scheduler.  The  techniques  proposed  will  work 
for  Ada  or  CSP-like  languages.  Best  results  will  be  obtained  for  programs  having  only  static  naming  of  task 
objects. 

[Tayl86b]  Abstract:  The  research  objectives  of  the  Arcadia  project  are  twofold:  discovery  and  development  of 
environment  architecture  principles  and  creation  of  novel  software  development  tools.  The  environment  archi¬ 
tecture  is  intended  to  reconcile  extensibility  with  the  often  conflicting  goal  of  integration,  including  both  a  uni¬ 
form  user  interface  and  coordination  and  management  of  tools  and  software  objects.  Work  on  tools  is  focused  on 
analysis  of  software  objects  at  every  stage  of  software  development  and  maintenance,  and  is  especially  aimed  at 
analysis  of  concurrent  and  real-time  software.  A  prototype  environment  architecture  and  toolset  is  being 
developed  in  Ada,  to  support  Ada  software  development.  The  authors  describe  the  research  objectives  and 
approaches  being  taken,  the  organization  of  the  research  endeavor,  and  current  status  of  the  work. 

[Tayl88]  Abstract:  Early  software  environments  have  supported  a  narrow  range  of  activities  (programming
environments)  or  else  been  restricted  to  a  single  “hard-wired”  software  development  process.  The  Arcadia
research  project  is  investigating  the  construction  of  software  environments  that  are  tightly  integrated,  yet  flexible
and  extensible  enough  to  support  experimentation  with  alternative  software  processes  and  tools.  This  has  led  us
to  view  an  environment  as  being  composed  of  two  distinct,  cooperating  parts.  One  is  the  variant  part,  consisting
of  process  programs  and  the  tools  and  objects  used  and  defined  by  those  programs.  The  other  is  the  fixed  part,  or
infrastructure,  supporting  creation,  execution,  and  change  to  the  constituents  of  the  variant  part.  The  major
components  of  the  infrastructure  are  a  process  programming  language  and  interpreter,  object  management
system,  and  user  interface  management  system.  Process  programming  facilitates  precise  definition  and  automated
support  of  software  development  and  maintenance  activities.  The  object  management  system  provides  typing,
relationships,  persistence,  distribution  and  concurrency  control  capabilities.  The  user  interface  management  sys¬ 
tem  mediates  communication  between  human  users  and  executing  processes,  providing  pleasant  and  uniform 
access  to  all  facilities  of  the  environment.  Research  in  each  of  these  areas  and  the  interaction  among  them  is 
described. 

[Teic77]  Abstract:  PSL/PSA  is  a  computer-aided  structured  documentation  and  analysis  technique  that  was
developed  for,  and  is  being  used  for,  analysis  and  documentation  of  requirements  and  preparation  of  functional 
specifications  for  information  processing  systems.  The  present  status  of  requirements  definition  is  outlined  as  the 
basis  for  describing  the  problem  which  PSL/PSA  is  intended  to  solve.  The  basic  concepts  of  the  Problem  State¬ 
ment  Language  are  introduced  and  the  content  and  use  of  a  number  of  standard  reports  that  can  be  produced  by 
the  Problem  Statement  Analyzer  are  briefly  described. 

The  experience  to  date  indicates  that  computer-aided  methods  can  be  used  to  aid  system  development 
during  the  requirements  definition  stage  and  that  the  main  factors  holding  back  such  use  are  not  so  much  related 
to  the  particular  characteristics  and  capabilities  of  PSL/PSA  as  they  are  to  organizational  considerations 
involved  in  any  change  in  methodology  and  procedure. 

[Teit81] Abstract: Interlisp is a programming environment based on the Lisp programming language. In
widespread  use  in  the  artificial  intelligence  community,  Interlisp  has  an  extensive  set  of  user  facilities,  including 
syntax  extension,  uniform  error  handling,  automatic  error  correction,  an  integrated  structure-based  editor,  a 
sophisticated  debugger,  a  compiler,  and  a  filing  system.  Its  most  popular  implementation  is  Interlisp-10,  which 
runs  under  both  the  Tenex  and  Tops-20  operating  systems  for  the  DEC  PDP-10  family.  Interlisp-10  now  has 
approximately  300  users  at  20  different  sites  (mostly  universities)  in  the  US  and  abroad.  It  is  an  extremely  well 
documented  and  well  maintained  system. 

Interlisp  has  been  used  to  develop  and  implement  a  wide  variety  of  large  application  systems.  Examples 
include  the  Mycin  system  for  infectious  disease  diagnosis,  the  Boyer-Moore  theorem  prover,  and  the  BBN 
speech  understanding  system. 

This  article  describes  the  Interlisp  environment,  the  facilities  available  in  it,  and  some  of  the  reasons  why 
Interlisp  developed  as  it  has. 

[Teit84]  Introduction:  This  paper  introduces  the  reader  to  many  of  the  salient  features  of  the  Cedar  Program¬ 
ming  Environment,  a  state-of-the-art  programming  system  that  combines  in  a  single  integrated  environment:  high 
quality  graphics,  a  sophisticated  editor  and  document  preparation  facility,  and  a  variety  of  tools  for  the  program¬ 
mer  to  use  in  the  construction  and  debugging  of  his  programs.  The  Cedar  programming  language  is  a  strongly- 
typed,  compiler-oriented  language  of  the  Pascal  family.  What  is  especially  interesting  about  the  Cedar  project  is 
that  it  is  one  of  the  few  examples  where  an  interactive,  experimental  programming  environment  has  been  built 
for  this  kind  of  language.  In  the  past,  such  environments  have  been  confined  to  dynamically  typed  languages  like 
Lisp  and  Smalltalk. 

The  paper  attempts  to  give  the  reader  the  feel  of  the  Cedar  system  by  emulating  a  live  demonstration.  The 
demonstration  is  actually  taken  from  a  video  tape  of  such  a  live  demo;  the  sequence  of  events,  as  well  as  the 
dialogue,  is  fairly  close  to  what  a  viewer  of  this  tape  would  see  and  hear.  Numerous  snapshots  of  the  display 
taken  at  various  points  during  the  session  simulate  the  visual  information  contained  in  the  tape.  Text  that  would 
actually  appear  on  the  display  during  the  demonstration  -  either  because  the  user  typed  it  or  the  system  printed  it 
-  will  appear  in  this  paper  in  a  distinguished  font.  The  explanations  that  the  demonstrator  would  give  will  be  in 
the normal font. Comments that would be distracting during a live demonstration but are appropriate for the
paper  are  included  as  footnotes. 

[Thay75] Abbreviated Introduction: The need for improving the reliability of delivered software is becoming
increasingly  obvious  to  both  the  purchasers  and  producers  of  today’s  software  systems.  As  noted  by  Boehm,  the 
records  show  many  examples  of  software  systems  which,  when  delivered  for  operational  use,  either  performed  in 
a  degraded  fashion  or  failed  to  perform  at  all.  The  results  are  higher  software  costs  and  delays  in  operational 
usage. 
In a study being performed by TRW for the Rome Air Development Center, data from four large software
systems  are  being  analyzed  to  determine  the  types  of  errors  found  in  software  during  testing.  The  objective  is 
principally  to  recommend  new  development  or  test  techniques  for  the  detection  and  prevention  of  software 
errors,  but  we  are  also  attempting  to  model  software  reliability.  In  the  course  of  supplying  real  data  descriptive  of 
software  reliability  and  for  model  evaluation,  we  have  had  to  determine  (1)  what  data  are  generally  available,  (2) 
methods  for  collecting  and  storing  these  data,  (3)  methods  for  describing  software  errors,  (4)  methods  for 
characterizing  the  software,  and  the  development  and  test  processes  in  quantitative  terms,  and  finally  (5) 
methods  of  analysis.  Although  the  projects  studied  have  varied  greatly  in  size,  language,  operating  mode,  and 
structure,  the  data  available  during  the  development  process  were  similar  for  each  project:  error  data,  recorded 
in  various  forms  of  software  problem  reports  (SPR)  and  ancillary  project  data  needed  to  understand  and  support 
analysis  of  the  error  data.  Although  the  data  were  not  generated  specifically  for  the  study,  we  found  that  we 
could  do  much  to  quantify  software  reliability  and  the  characteristics  of  the  software  itself,  as  well  as  improving 
our  understanding  of  both  the  software  and  the  development  process.  Some  results  of  the  Software  Reliability 
Study  will  be  presented  to  illustrate  the  benefits  of  software  reliability  data  collection  and  analysis.  Also 
presented  are  some  recommendations  for  identifying  data  that  need  collecting. 

[Thay80]  Abbreviated  Introduction:  Nearly  every  software  engineering  development  project  is  plagued  with 
numerous  problems  leading  to  late  delivery,  cost  overruns,  or  unsatisfied  customers.  Often,  these  problems  are 
technical.  However,  just  as  often,  they  are  managerial. 

Although  both  the  technological  and  managerial  aspects  of  software  engineering  were  recognized  at  about 
the  same  time,  improvements  and  developments  in  management  have  not  kept  pace  with  advances  in  the  tech¬ 
nology.  The  technology  of  software  engineering  as  a  well-defined  discipline  is  relatively  new;  however,  software 
engineering  has  progressed  to  the  point  where  many  major  issues  regarding  software  production  have  been  iden¬ 
tified,  and  considerable  progress  in  addressing  these  issues  has  been  made.  Practical  working  tools  to  support 
improved  production  are  commonly  available,  and  their  design  and  generation  have  become  a  recognized  topic 
for  university  instruction. 

Software  engineering  project  management  has  not  enjoyed  the  same  progress.  While  it  might  be  argued 
that  SEPM  has  been  defined,  it  is  far  from  a  recognized  discipline.  The  major  issues  and  problems  of  SEPM  have 
not  been  agreed  on  by  the  computing  community  as  a  whole,  and  consequently,  priorities  for  addressing  them 
have  not  been  widely  established.  Furthermore,  research  in  this  area  has  been  scant. 

[Theb84]  Abstract:  Recent  work  conducted  by  members  of  the  Purdue  Software  Metrics  Research  group  has 
focused  on  the  complexity  associated  with  coordinating  the  activities  of  persons  involved  in  large-scale  program¬ 
ming  efforts.  A  resource  model  is  presented  which  is  designed  to  reflect  the  impact  of  this  complexity  on  the 
economics  of  software  development.  The  model  is  based  on  a  formulation  in  which  development  effort  is  func¬ 
tionally  related  to  measures  of  product  size  and  manloading.  The  particular  formulation  used  is  meant  to  suggest 
a  logical  decomposition  of  development  effort  into  components  related  to  the  independent  programming  activity 
of  individuals  and  to  the  overhead  associated  with  the  required  information  flow  within  a  programming  team. 
The  model  is  evaluated  in  light  of  acquired  data  reflecting  a  large  number  of  commercially  developed  software 
products  from  two  separate  sources.  Additional  sources  of  data  are  actively  being  sought.  Although  strongly 
analytic  in  nature,  the  model’s  performance  is,  for  the  available  data,  at  least  as  good  in  accounting  for  the 
observed  variability  in  development  effort  as  some  highly  publicized  empirically  based  models  of  comparable 
complexity. It is argued, however, that the model’s principal strength lies not in its data fitting ability, but rather in
its  straightforward  and  intuitively  appealing  representation  of  relationships  involving  manpower,  time,  and  effort. 

[Thom80] Abstract: This paper deals with the statistics of estimating the software reliability of complex real-time
systems  where  an  electronic  digital  computer  and  associated  computer  programs  are  essential  elements  of  system 
design  and  function.  Testing  is  conducted  in  the  operating  environment  or  a  simulated  environment  related  to  the 
operating  environment  in  some  way.  The  procedure  is  Bayesian  so  that  improvement  of  reliability  estimation  is 
realized  in  a  formal  and  convenient  way  as  more  and  more  test  data  are  accumulated.  The  method  provides  for 
estimating  a)  both  hardware  and  software  components  of  total  system  reliability  and  b)  Bayesian  interval  limits 
using existing analytic techniques developed by the authors and others. The results apply to measurement and
prediction  of  reliability  performance,  to  acceptance  testing,  and  to  contractual  definition  and  implementation  of 
software  warranty  provisions  for  embedded  computer  systems. 

The  Bayesian  method  of  software-hardware  reliability  estimation  presented  here  exhibits  the  following 
unique  features: 

—  The  use  of  a  prior  p  on  the  probability  that  the  software  contains  errors.  This  prior  is  updated  as  test  failure 
data  are  accumulated.  Only  a  p  of  1  (software  known  to  contain  errors)  corresponds  to  a  case  already  treated 
in  the  literature. 

—  Hardware,  software,  and  unknown/ambiguous  source  failure  data  are  combined  to  yield  a  system  reliability 
estimation. 

—  A  decision-rule  treatment  is  developed  for  the  continuation  or  termination  of  testing  on  the  basis  of  specifica¬ 
tion  of  consumer  and  producer  risks  and  observed  test  results. 
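
Illustrative sketch (not taken from [Thom80]): the first feature listed above, a prior p on the probability that the software contains errors, can be pictured with a simple Bayes update. The sketch assumes, purely for illustration, that software containing errors fails any single test independently with a fixed probability q and that error-free software never fails. The function name and both parameters are hypothetical; the paper's actual model is not reproduced here.

```python
def posterior_contains_errors(p, q, n_clean_tests):
    """Posterior probability that the software contains errors after
    n_clean_tests consecutive failure-free tests.

    p: prior probability that the software contains errors
    q: assumed per-test failure probability when errors are present
    """
    # Likelihood of observing only passing tests if errors are present.
    like_errors = (1.0 - q) ** n_clean_tests
    # Error-free software passes every test, so its likelihood is 1.
    evidence = p * like_errors + (1.0 - p)
    if evidence == 0.0:
        raise ValueError("observations are impossible under these assumptions")
    return p * like_errors / evidence
```

Under these assumptions the posterior falls as failure-free tests accumulate, while p = 1 stays at 1 regardless of test results, which corresponds to the "software known to contain errors" case mentioned in the abstract.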

[Tich86] Abstract: With current compiler technology, changing a single line in a large software system may
trigger  massive  recompilations.  If  the  change  occurs  in  a  file  with  shared  declarations,  all  compilation  units 
depending  upon  that  file  must  be  recompiled  to  assure  consistency.  However,  many  of  those  recompilations  may 
be  redundant,  because  the  change  may  affect  only  a  small  fraction  of  the  overall  system. 

Smart  recompilation  is  a  method  for  reducing  the  set  of  modules  that  must  be  recompiled  after  a  change. 
The  method  determines  whether  recompilation  is  necessary  by  isolating  the  differences  among  program  modules 
and  analyzing  the  effect  of  changes.  The  method  is  applicable  to  languages  with  and  without  overloading.  A  pro¬ 
totype  demonstrates  that  the  method  is  efficient  and  can  be  added  with  modest  effort  to  existing  compilers. 
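
Illustrative sketch (not taken from [Tich86]): the idea of recompiling only what a change can affect can be approximated by comparing the old and new versions of a shared declaration file and selecting just the modules that reference a changed name. The data layout below (a mapping from declaration name to its text, and a mapping from module to the set of names it references) and both function names are assumptions made for this example. The paper's handling of overloading is not modeled.

```python
def changed_names(old_decls, new_decls):
    """Names that were added, removed, or altered between two versions."""
    names = set(old_decls) | set(new_decls)
    return {n for n in names if old_decls.get(n) != new_decls.get(n)}


def modules_to_recompile(old_decls, new_decls, references):
    """Modules whose referenced names intersect the changed names.

    references: dict mapping module name -> set of declaration names used
    """
    changed = changed_names(old_decls, new_decls)
    return {module for module, used in references.items() if used & changed}
```

A conventional build would recompile every module that includes the shared file; here a module that uses only unchanged declarations is left alone.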

[Tisc83]  Abstract:  This  paper  describes  how  MAP,  a  tool  for  understanding  software,  combines  static  analysis, 
some  dynamic  features,  and  an  interactive  presentation  to  aid  programmers  in  debugging.  Static  analysis  of  the 
sort  produced  in  optimizing  compilers  could  provide  programmers  with  useful  information  that  they  cannot  get 
from  dynamic  debuggers.  The  challenge  for  designers  of  static  analysis  tools  is  to  present  the  information  in  a 
useful  form. 

[Triv80]  Abstract:  This  paper  addresses  the  problem  of  validating  the  reliability  of  computer  systems  used  in 
life-critical  applications.  Due  to  extremely  high  reliability  requirements,  traditional  validation  methods  based  on 
lifetesting  are  no  longer  applicable.  A  validation  approach  based  on  a  judicious  combination  of  logical  proofs, 
analytical  models,  and  experimental  testing  is  advocated.  The  role  of  Markov  reliability  models  in  the  validation 
process  is  discussed  and  a  taxonomy  of  validation  techniques  is  presented. 

[Troy81]  Abstract:  The  purpose  of  this  study  is  to  investigate  the  possibility  of  providing  some  useful  measures 
to  aid  in  the  evaluation  of  software  designs.  Such  measurements  should  allow  some  degree  of  predictability  in 
estimating  the  quality  of  a  coded  software  product  based  upon  its  design  and  should  allow  identification  and 
correction  of  deficient  designs  prior  to  the  coding  phase,  thus  providing  lower  software  development  costs.  The 
study  involves  the  identification  of  a  set  of  hypothesized  measures  of  design  quality  and  the  collection  of  these 
measures  from  a  set  of  designs  for  a  software  system  developed  in  industry.  In  addition,  the  number  of  modifica¬ 
tions  made  to  the  coded  software  that  resulted  from  these  designs  was  collected.  A  data  analysis  was  performed 
to  identify  relationships  between  the  measures  of  design  quality  and  the  number  of  modifications  made  to  the 
coded  programs.  The  results  indicated  that  module  coupling  was  an  important  factor  in  determining  the  quality 
of  the  resulting  product.  The  design  metrics  accounted  for  roughly  50-60%  of  the  variability  in  the  modification 
data,  which  supports  the  findings  of  previous  studies.  Finally,  the  weaknesses  of  the  study  are  identified  and  pro¬ 
posed  improvements  are  suggested. 

[Troy86]  Abstract:  Over  the  past  decade,  a  set  of  models  derived  from  the  application  of  conventional  reliability 
theory  to  software  engineering  has  been  proposed  with  regard  to  the  evaluation  of  program  reliability.  Observa¬ 
tions  of  operating  software  have  shown  that  these  models  are  not  sufficient  to  account  for  operational  reliability. 
This  limitation  requires  a  cautious  utilization  of  every  model:  each  reliability  evaluation  must  be  considered  as  a 
special case which must be based upon a statistical analysis preceding any modeling. This implies suitable
methods  and  means  are  needed.  The  purpose  of  this  paper  is  to  propose  a  stepwise  statistical  methodology  for 
the  study  of  operating  system  reliability  and  associated  tools.  An  example  of  the  application  of  this  method  for 
the  ARGOS  center  of  CNES  is  presented. 

[TsaI86]  Abstract:  This  paper  describes  the  concepts,  functions  and  user  interface  of  the  tool  for  unit  test  con¬ 
struction  and  execution.  This  tool,  the  Interactive  Unit  Test  Facility  (IUTF),  addresses  some  of  the  major  con¬ 
cerns  in  the  unit  testing  process.  The  first  of  these  is  the  execution  of  the  unit  and  reporting  its  results  in  terms  of 
success/failure  and  coverage  measures.  The  other  concern,  sometimes  more  painful  and  time  consuming  for  the 
programmer,  is  the  preparation  and  maintenance  of  test  cases  for  execution. 

IUTF  performs  static  and  dynamic  testing  of  the  unit  provided  all  results  from  the  analysis  and  execution 
stage  are  stored  in  a  central  data  base.  The  important  design  notions  such  as  testing  environment,  scaffolding, 
and  test  drivers  and  construction  mechanisms  are  introduced  in  the  paper,  and  the  transformation  of  internal 
functions  of  the  unit  testing  tool  into  usable  and  consistent  interfaces  (via  the  predefined  screen  hierarchy)  is 
described. 

[Turn80] Summary: Choosing the right program structures can lead to better programs and modular design can
make  large  programs  more  manageable.  This  paper  reviews  the  possible  structural  relationships  between  the 
modules  of  a  program  and  generates  a  tentative  morphology  of  program  structure  types.  It  concludes  that,  with 
some  exceptions,  the  hypothetical  pure  tree  structure  is  the  best  choice  for  most  data  processing  applications. 

[Ullm73] Summary: We give two algorithms for computing the set of available expressions at entrance to the
nodes of a flow graph. The first takes O(mn) steps on a program flow graph (one in which no node has more than
two successors), where n is the number of nodes and m the number of expressions which are ever computed. A
modified version of this algorithm requires O(n^2) steps of an extended type, where bit vector operations are
regarded as one step. We present another algorithm which works only for reducible flow graphs. It requires O(n
log n) extended steps.
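
Illustrative sketch (not taken from [Ullm73]): the problem the summary refers to, finding which expressions are available at entry to each flow-graph node, is shown below using the textbook round-robin iterative method with expression sets stored as integer bit vectors. This is not either of Ullman's algorithms and makes no claim to their step bounds. The function name, the gen/kill encoding, and the graph representation are assumptions made for this example.

```python
def available_expressions(succ, gen, kill, entry, num_exprs):
    """Return a dict: node -> bit mask of expressions available at its entry.

    succ: dict mapping node -> list of successor nodes
    gen:  dict mapping node -> bit mask of expressions computed in the node
    kill: dict mapping node -> bit mask of expressions invalidated in the node
    """
    full = (1 << num_exprs) - 1
    # Build predecessor lists from the successor lists.
    preds = {n: [] for n in succ}
    for n, successors in succ.items():
        for s in successors:
            preds[s].append(n)
    # Nothing is available at the entry; start everything else optimistic.
    avail_in = {n: (0 if n == entry else full) for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry:
                continue
            new_in = full
            for p in preds[n]:
                # Available on exit from p: generated there, or available
                # on entry to p and not killed there.
                out_p = gen[p] | (avail_in[p] & ~kill[p])
                # Must hold on every incoming path, hence intersection.
                new_in &= out_p
            if new_in != avail_in[n]:
                avail_in[n] = new_in
                changed = True
    return avail_in
```

Each bitwise AND or OR over a whole mask plays the role of the single "extended step" on a bit vector that the summary mentions.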

[Unde63]  Introduction:  When  a  scientific  program  is  to  be  used  by  physicists  as  an  aid  in  their  investigation,  the 
programmer  must  pay  careful  attention  to  the  problem  of  producing  the  program  in  a  suitable  form.  It  may  hap¬ 
pen  that  parts  of  the  program  which  the  programmer  may  prefer  to  regard  as  peripheral  activity,  such  as  input 
and  output  processes,  then  assume  a  major  importance  and  occupy  much  of  his  time  and  program;  the  numerical 
method  becomes  a  small  box  which  works  well  most  of  the  time  and  is  a  great  nuisance  when  it  doesn’t. 

The  problem  presented  by  the  construction  of  a  good  input  section  is  severe.  Best  efforts  to  date  fall  far 
short  of  perfection,  and  this  will  be  attained  only  when  the  experience  gained  by  use  of  a  program  is  stored,  not 
by the user who runs problems on it, but within the program itself, ready for intelligent use by the program
when  a  problem  is  presented  to  it. 

[Vale89]  Abstract:  The  practice  of  measuring  software  is  increasingly  seen  as  a  valuable  tool  in  the  overall 
development  of  high-quality  software  projects.  Software  measurement  attempts  to  use  known,  quantifiable, 
objective,  and  subjective  measures  to  compare  and  profile  software  projects  and  products.  To  compute  these 
measures effectively, data that characterize the software project and product are needed. This paper covers
aspects of data collection and software measurement as they have been applied by one particular organization, the
Software  Engineering  Laboratory  (SEL).  The  measurement  results  include  the  experiences  and  lessons  learned 
through  numerous  experiments  conducted  by  the  SEL  on  nearly  60  flight  dynamics  software  projects.  These 
experiments  have  attempted  to  determine  the  effect  of  various  software  development  technologies  on  overall 
software  project  quality  and  on  specific  measures  such  as  productivity,  reliability,  and  maintainability. 

[Vali84]  Abstract:  Humans  appear  to  be  able  to  learn  new  concepts  without  needing  to  be  programmed  explicitly 
in  any  conventional  sense.  In  this  paper  we  regard  learning  as  the  phenomenon  of  knowledge  acquisition  in  the 
absence  of  explicit  programming.  We  give  a  precise  methodology  for  studying  this  phenomenon  from  a 
computational viewpoint. It consists of choosing an appropriate information gathering mechanism, the learning
protocol,  and  exploring  the  class  of  concepts  that  can  be  learned  using  it  in  a  reasonable  (polynomial)  number  of 
steps.  Although  inherent  algorithmic  complexity  appears  to  set  serious  limits  to  the  range  of  concepts  that  can  be 
learned,  we  show  that  there  are  some  important  nontrivial  classes  of  propositional  concepts  that  can  be  learned 
in  a  realistic  sense. 

[Vemu80] Abstract: Software and its development are complex. The complexity stems from the multiplicity of
objectives  and  attributes  that  one  has  to  work  with  during  its  development.  Human  comprehension  of  multiple 
objectives  and  attributes  can  be  aided  by  displaying  the  relevant  data  on  a  two-dimensional  plane.  Several  display 
techniques, and in particular the so-called snowflakes and Chernoff faces, are discussed and their utility in
software  research  explored.  Examples  using  real  and  hypothetical  data  are  presented  to  illustrate  the  suitability 
of  these  pictures. 

[Vern89] Abstract: Because Function Point Analysis (FPA) has now been in use for a decade, and in spite of its
increasing  popularity  has  met  with  some  recent  criticisms,  it  is  time  to  review  how  appropriate  it  still  is  for 
today’s technologies. A critical review of the FPA approach examines in particular the pioneering and continuing
work of Albrecht and more recent work by Symons. Technological dependencies in FPA-type metrics are identified,
and a model for the derivation of an FPA-type metric for a new software technology is given. A model for the
calibration of FPA-type metrics for new technologies in terms
of  a  reference  technology  is  also  presented.  Such  calibration  is  essential  for  comparative  productivity  studies. 
The role of module estimation in exposing parts of the 'anatomy' of the FPA approach is investigated. The
derivation  and  calibration  models  are  applied  to  a  significant  case  study  in  which  a  new  FPA-type  metric  suited  to 
a  particular  software  development  technology  is  derived,  calibrated  and  compared  with  other  published  versions 
of  FPA  metrics. 

[Vesa83]  Abstract:  An  empirical  study  of  447  operational  commercial  and  clerical  Cobol  programs  in  one  Aus¬ 
tralian  organization  and  two  U.S.  organizations  was  carried  out  to  determine  whether  program  complexity,  pro¬ 
gramming  style,  programmer  quality,  and  the  number  of  times  a  program  was  released  affected  program  repair 
maintenance.  In  the  Australian  organization  only  program  complexity  and  programming  style  were  statistically 
significant.  In  the  two  U.S.  organizations  only  the  number  of  times  a  program  was  released  was  statistically  signi¬ 
ficant.  For  all  organizations  repair  maintenance  constituted  a  minor  problem:  over  90  percent  of  the  programs 
studied  had  undergone  less  than  three  repair  maintenance  activities  during  their  lifetime. 

[Voge80]  Abstract:  This  paper  describes  the  automated  testing  tool  SADAT,  which  supports  the  testing  of  sin¬ 
gle  Fortran  modules.  The  different  functions  which  are  integrated  in  this  system  are  explained,  the  usage  of  the 
tool  is  demonstrated,  and  some  output  results  are  presented.  The  special  benefits  of  the  SADAT  system  are  sum¬ 
marized.  The  history  and  the  present  status  of  the  system  are  outlined.  Finally,  a  listing  of  further  reference 
material  and  information  on  the  program  availability  are  included. 

[Vosb84]  Abstract:  Fourteen  factors  that  influence  the  efficiency  of  programming  projects  were  identified  in  a 
corporate-wide  study  of  44  ITT  programming  projects  in  nine  countries.  Productivity  factors  were  classified 
according  to  project  management's  ability  to  control  them.  Product-related  factors  are  not  generally  under  the 
control  of  project  management.  They  describe  intrinsic  properties  of  the  programming  product  and  tend  to  place 
limitations  on  achievable  productivity.  Project-related  factors,  on  the  other  hand,  are  controllable  by  project 
management  to  varying  degrees.  These  factors  provide  real  opportunities  for  productivity  improvement.  The 
analysis  indicates  that  productivity  variation  is  almost  equally  attributable  to  product-related  and  project-related 
factors. 

[Vouk85c]  Abstract:  Software  fault  tolerance  mechanisms  commonly  used  today  suffer  from  their  inability  to 
successfully  cope  with  correlated  failures  of  components  of  a  fault-tolerant  software  (FTS)  system.  In  this  paper 
methods  for  computing  the  reference  and  observed  distributions  of  multiple  component  failures  (MCF’s)  of  a 
FTS  are  given.  A  MCF  of  category  k  refers  to  existence  of  a  test  case  for  which  exactly  k  components  of  a  FTS 
system fail. The reference distribution is based on the response of all components on a randomly selected test
set,  and  the  assumption  that  the  conditional  intercomponent  responses  are  mutually  independent.  Identification 
of  correlated  failures  and  the  effectiveness  of  random  testing  for  detecting  correlated  failures  is  discussed 
through  comparison  of  the  reference  and  observed  MCF  distributions  for  an  experimental  FTS  system. 
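
Illustrative sketch (not taken from [Vouk85c]): one way to picture the comparison described above is to compute, from per-component failure probabilities and an independence assumption, the probability that exactly k components fail on a test case, and to set that against the fractions actually observed. The independence-based calculation below is the standard Poisson-binomial recurrence. Treating it as the "reference distribution", along with the function names and input layout, is an assumption of this example rather than the paper's stated procedure.

```python
def reference_distribution(fail_probs):
    """P(exactly k components fail) for k = 0..N, assuming independence.

    fail_probs: per-component probability of failing on a random test case
    """
    dist = [1.0]  # with zero components, zero failures is certain
    for p in fail_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this component passes
            nxt[k + 1] += mass * p      # this component fails
        dist = nxt
    return dist


def observed_distribution(outcomes):
    """Fraction of test cases on which exactly k components failed.

    outcomes: list of per-test-case tuples, True where a component failed
    """
    n_components = len(outcomes[0])
    counts = [0] * (n_components + 1)
    for row in outcomes:
        counts[sum(row)] += 1
    return [c / len(outcomes) for c in counts]
```

If the observed fraction of cases with several simultaneous failures is well above the reference value, the components are failing together more often than independence would predict.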

[Vouk86a] Abstract: A major weakness of software fault tolerance mechanisms commonly discussed today is
their  inability  to  cope  successfully  with  correlated  failures  of  components  of  a  fault-tolerant  software  (FTS)  sys¬ 
tem.  When  correlated  errors  are  present,  the  probability  that  a  FTS  system  fails  may  become  unacceptably  large. 
The  results  of  a  FTS  experiment  are  used  to  show  deficiencies  of  the  simple  random  testing  approach  in  the  con¬ 
text  of  FTS  testing.  Inter-version  failure  dependence  was  detected  in  the  experiment,  and  the  data  indicate  that 
in  high  reliability  FTS  components  a  considerable  percentage  of  correlated  failures  occur  in  the  domain  of 
extremal  or  special  input  values,  a  region  not  excited  by  simple  random  sampling  of  the  input  space.  The  use  of 
carefully  designed  test  cases  as  a  supplement  to  random  testing,  as  well  as  use  of  structure  based  testing  is  recom¬ 
mended. 

[Vouk86b] Abstract: Common approaches to software fault-tolerance depend on redundancy of critical
software  components.  Six  functionally  equivalent  programs  were  tested  with  specification  based  random  and 
extremal/special  value  (ESV)  test  cases.  Statement  and  branch  coverage  were  used  to  measure  and  compare  the 
attained  testing  effectiveness.  It  was  observed  that  both  measures  reached  a  nearly  steady  state  value  after  25  to 
75  random  test  cases.  Coverage  saturation  curves  appear  to  follow  an  exponential  growth  model.  However,  the 
steady  state  values  for  branch  coverage  of  different  components,  but  the  same  input  cases,  differed  by  as  much  as 
22%. The effect is the result of the differences in the detailed structure of the components. Improvement in coverage
provided by the random test data, after the ESV cases were executed, was only about 1%. Results indicate
that  extensive  random  testing  can  be  a  process  of  diminishing  returns,  and  that  in  the  FTS  context  functional 
(“black  box”)  testing  can  provide  a  very  uneven  execution  coverage  of  the  functionally  equivalent  software,  and 
therefore  should  be  supplemented  by  structure  based  testing. 

[Wahl88] Abstract: As a consequence of timing considerations, program execution behavior on a distributed
system  may  not  be  reproducible  from  one  execution  to  the  next.  The  situation  is  exacerbated  when  the  architec¬ 
ture  of  the  distributed  system  is  non  von  Neumann,  as  in  the  case  of  a  dataflow  machine.  This  fact  has  implica¬ 
tions  for  the  testing  and  debugging  of  dataflow  programs.  In  this  paper  a  distributed  debugging  methodology  for 
dataflow  architectures  is  presented.  A  graphical  debugging  simulator  for  a  dataflow  machine  is  being  developed 
to  implement  this  methodology.  This  debugging  simulator  allows  the  user  to  debug  compiled  high-level  dataflow 
programs  written  for  the  machine.  The  ideas  of  the  debugging  methodology  are  outlined  and  the  debugging  simu¬ 
lator  is  described.  Special  emphasis  is  paid  to  the  multi-pass  feature  of  the  debugging  simulator  which  solves  the 
nonreproducibility  problem  of  distributed  debuggers  and  allows  the  user  to  execute  the  program  more  than  once 
with  the  identical  instruction  sequence  to  be  sure  that  a  fault  has  been  removed. 

[Waka89]  Abbreviated  Introduction:  Generally,  telecommunications  software  must  handle  a  complex,  large- 
scale  protocol  modeled  as  extended  finite-state  machines.  Much  research  has  been  done  on  how  to  specify 
telecommunications  software  with  formal  specification  languages.  However,  these  research  results  have  not  been 
completely  successful  for  three  main  reasons: 

1.  The  methods  devised  cannot  detect  errors  in  individual  finite-state  machines. 

2.  They  cannot  detect  protocol  errors,  such  as  missing  signal-reception  definitions,  in  large-scale  protocol  specif¬ 
ications. 

3.  They  do  not  have  functions  to  improve  the  legibility  of  manually  drafted  specifications. 

To  overcome  these  defects,  we  proposed  new  validation,  verification,  and  simplification  methods  for 
telecommunications  specifications.  At  Kokusai  Denshin  Denwa  Co.,  we  have  developed  a  prototype  specifica¬ 
tion  support  system,  Escort,  that  integrates  these  proposed  methods. 

[Wake88]  Abstract:  Computer  scientists  are  continually  attempting  to  improve  software  system  development. 
Systems are developed in a top-down fashion for better modularity and understandability. Performance enhancements
are implemented for more speed. One area in which a great deal of effort is being devoted is software
maintenance.  Brooks  estimates  that  fifty  percent  of  the  development  costs  of  a  software  system  is  for  mainte¬ 
nance  activities.  Since  a  large  portion  of  the  effort  of  a  system  is  devoted  for  maintenance,  it  is  reasonable  to 
assume  that  driving  down  maintenance  costs  would  drive  down  the  overall  cost  of  the  system.  Measuring  the 
complexity  of  a  software  system  could  aid  in  this  attempt.  By  lowering  the  complexity  of  the  system  or  of  subsys¬ 
tems  within  the  system,  it  may  be  possible  to  reduce  the  amount  of  maintenance  necessary.  Software  quality 
metrics  were  developed  to  measure  the  complexity  of  software  systems.  This  study  relates  the  complexity  of  the 
system  as  measured  by  software  metrics  to  the  amount  of  maintenance  to  that  system.  We  have  developed  a 
model  which  uses  several  software  quality  metrics  as  parameters  to  predict  maintenance  activity. 

[Walk81] Preface: The purpose of the Paradigmatic Approach is to provide a new image for conceptualizing the
software development cycle. It is believed that this new image will engender methodologies that predictably produce
reliable software systems. This text is not a cookbook of techniques. It does not attempt to direct action
through  prescribing  specific  behavior  patterns.  This  text  does,  however,  present  an  integrated  image  for  organiz¬ 
ing  behavior  and  a  universal  metric  for  evaluating  that  behavior.  The  reader  will  be  exposed  to  a  powerful  image, 
a  paradigm,  which  provides  an  integrated  perception  of  software  development.  This  paradigm  will  help  him  to 
organize  and  judge  technical  behavior  in  a  consistent  and  productive  manner.  The  consistent  behaviors  which 
result  from  paradigmatic  thinking  are  termed  “the  paradigmatic  approach”  and  will  facilitate  the  evolution  of 
software  management  from  a  craft  to  an  engineering  discipline. 

[Wall89]  Abbreviated  Introduction:  Verification  and  validation  is  one  of  the  software-engineering  disciplines 
that  help  build  quality  into  software.  V&V  is  a  collection  of  analysis  and  testing  activities  across  the  full  life  cycle 
and  complements  the  efforts  of  other  quality-engineering  functions.  This  overview  article  explains  what  V&V  is, 
shows  how  V&V  groups’  efforts  relate  to  other  groups’  efforts,  describes  how  to  apply  V&V,  and  summarizes 
evaluations  of  V&V  effectiveness. 

[Wals77a]  Overview:  Improvements  in  programming  technology  have  paralleled  improvements  in  computing 
system  architecture  and  materials.  Along  with  increasing  knowledge  of  the  system  and  program  development 
processes,  there  has  been  some  notable  research  into  programming  project  measurement,  estimation,  and  plan¬ 
ning.  Discussed  is  a  method  of  programming  project  productivity  estimation.  Also  presented  are  preliminary 
results  of  research  into  methods  of  measuring  and  estimating  programming  project  duration,  staff  size,  and  com¬ 
puter  cost. 

[Walt79]  Abstract:  With  the  increasing  complexity  of  the  software  systems  being  developed  today  and  the 
requirement  to  develop  them  within  a  short  schedule,  there  is  greater  emphasis  than  before  on  a  strong  quality 
management  program.  In  such  a  program  it  is  essential  that  we  know  how  to  specify  and  measure  software  quality 
so  that  we  can  ensure  the  system  meets  our  overall  life  cycle  objective.  It  is  important  not  only  from  the  system 
performance point of view but also in cost.

The  role  of  the  manager  in  a  software  quality  program  is  important  throughout  the  entire  development 
phase  of  our  program.  The  impact  of  the  manager’s  decisions  during  this  phase  will  be  felt  not  only  during  opera¬ 
tion  and  maintenance  but  also  during  future  acquisitions  that  interface  with  the  system  or  that  incorporate  exist¬ 
ing  software  from  the  current  development. 

This  chapter  addresses  how  both  the  acquisition  manager  and  the  development  program  manager  can 
identify  which  quality  factors  are  important  and  how  metrics  of  these  factors  can  be  applied  in  the  software  qual¬ 
ity  management  program.  (The  acquisition  manager  and  the  development  program  manager  titles  throughout  this 
chapter  refer  to  the  organizations  rather  than  the  persons,  per  se.)  This  approach  of  applying  metrics  is  based 
upon  the  concept  of  software  quality  and  of  the  associated  metrics  described  in  the  previous  chapter. 

[Wamp85] Abstract: The topic of this thesis is the development of and the results obtained from a system which
analyzes Ada tasking programs in order to identify potential concurrency related programming anomalies. The
static data flow used in this thesis are described in great detail in a paper by Taylor.

The major purpose for this undertaking was to characterize the actual size necessary to store the concurrency
related information from some sample programs. This goal was deemed appropriate since many of the
techniques used in the analysis process have already been shown to be in the set of NP-complete problems. The
method employed toward this end was to build a prototype version of the Static Concurrency Analysis system
with sufficient instrumentation in order to gather statistics on the size of several of the data structures
involved  in  the  process,  and  then  to  run  some  sample  programs  through  this  tool. 

It  will  be  seen  that  the  size  of  the  major  structure  built,  the  Concurrency  State  Graph  (CSG),  grows 
exponentially  in  the  number  of  tasks  that  exist  in  the  program  being  analyzed  for  all  example  programs  thus  far 
run  through  the  prototype  system.  This  CSG  structure  models  all  possible  tasking  related  program  states  that  a 
given  Ada  program  could  possibly  be  in  as  well  as  the  successor  and  predecessor  relationships  between  these 
states. 

The SCA system currently does not accept the full Ada language and it appears to be a less than trivial
task  to  extend  the  prototype  system  so  that  all  Ada  tasking  programs  are  analyzable. 

[Wam72] Introduction: With the billions of dollars in installed computer equipment deployed worldwide and
with  vast  sums  needed  to  operate,  program,  and  maintain  this  hardware,  computer  users  are  increasingly  aware 
of  the  need  to  improve  the  efficiency  of  data  processing  operations. 

Corporations  use  computers  to  handle  many  tasks.  The  range  of  tasks  and  size  of  the  computers  increase 
constantly.  But  corporations  do  not  know  how  to  evaluate  the  efficiency  of  their  data  processing  operations.  Ina¬ 
bility  to  make  an  accurate  assessment  has  kept  management  fearful  of  the  entire  data  processing  experience  and 
has  resulted,  generally,  in  a  hands-off  attitude.  This  tail-wags-the-dog  situation  places  a  particularly  heavy  burden 
on  that  portion  of  management  directly  responsible  for  the  data  processing  operation.  They  have  to  make  recom¬ 
mendations  on  new  and/or  added  equipment,  but  they  lack  objective  techniques  for  evaluating  the  DP  operation 
and  projecting  needs.  This  paper  reviews  some  of  the  issues  faced  and  alternative  techniques  for  assessing  sys¬ 
tem  performance. 

[Warr82]  Maintenance  of  software  is  a  major  problem  that  the  data  processing  industry  faces  today.  This  paper 
describes MAP, a tool that addresses the problems of software maintenance by helping programmers to understand
their programs.

[Wate79]  Abstract:  This  paper  presents  a  method  for  automatically  analyzing  loops,  and  discusses  why  it  is  a 
useful  way  to  look  at  loops.  The  method  is  based  on  the  idea  that  there  are  four  basic  ways  in  which  the  logical 
structure  of  a  loop  is  built  up.  An  experiment  is  presented  which  shows  that  this  accounts  for  the  structure  of  a 
large  class  of  loops.  The  paper  discusses  how  the  method  can  be  used  to  automatically  analyze  the  structure  of  a 
loop,  and  how  the  resulting  analysis  can  be  used  to  guide  a  proof  of  correctness  for  the  loop.  An  automatic  sys¬ 
tem  is  described  which  performs  this  type  of  analysis.  The  paper  discusses  the  relationship  between  the  structure 
building  methods  presented  and  programming  language  constructs.  A  system  is  described  which  is  designed  to 
assist  a  person  who  is  writing  a  program.  The  intent  is  that  the  system  will  cooperate  with  a  programmer 
throughout  all  phases  of  work  on  a  program  and  be  able  to  communicate  with  the  programmer  about  it. 

[Webe83]  Abstract:  This  paper  summarizes  techniques  for  designing  and  implementing  source-level  interactive 
debuggers  for  concurrent  programs.  Facilities  common  to  source-level  interactive  debuggers  have  been  adapted 
to  meet  the  needs  of  a  concurrent  programming  environment.  Of  special  interest  are  those  debugging  facilities 
which  allow  the  programmer  to  monitor  and  influence  the  execution  of  concurrent  processes. 

[Wegb74] Abstract: Current methods for mechanical program verification require a complete predicate specification
on each loop. Because this is tedious and error prone, producing a program with complete, correct predicates
is reasonably difficult and would be facilitated by machine assistance. This paper discusses techniques for
mechanically synthesizing loop predicates. Two classes of techniques are considered: (1) heuristic methods
which derive loop predicates from boundary conditions and/or partially specified inductive assertions; (2)
extraction methods which use input predicates and appropriate weak interpretations to obtain certain classes of
loop predicates by an evaluation on the weak interpretation.

[Wegb75]  Abstract:  One  means  of  analyzing  program  performance  is  by  deriving  closed-form  expressions  for 
their  execution  behavior.  This  paper  discusses  the  mechanization  of  such  analysis,  and  describes  a  system, 
Metric,  which  is  able  to  analyze  simple  Lisp  programs  and  produce,  for  example,  closed-form  expressions  for 
their  running  time  expressed  in  terms  of  size  of  input.  This  paper  presents  the  reasons  for  mechanizing  program 
analysis,  describes  the  operation  of  Metric,  explains  its  implementation,  and  discusses  its  limitations. 

[Wegb76] Abstract: This paper is concerned with proving properties of programs which use data structures. The
goal  is  to  be  able  to  prove  that  all  instances  of  a  class  (e.g.,  as  defined  in  Simula)  satisfy  some  property.  A  method 
of  proof  which  achieves  this  goal,  generator  induction,  is  studied  and  compared  to  other  proof  rules  and 
methods:  inductive  assertions,  recursion  induction,  computation  induction,  and,  in  some  detail,  structural 
induction.  The  paper  concludes  by  using  generator  induction  to  prove  a  characteristic  property  of  an  implemen¬ 
tation  of  hashtables. 

[Wegb77]  Abstract:  Most  current  approaches  to  mechanical  program  verification  transform  a  program  and  its 
specifications  into  first-order  formulas  and  prove  these  formulas  valid.  Since  first-order  predicate  calculus  is  not 
decidable,  such  approaches  are  inherently  limited.  This  paper  proposes  an  alternative  approach  to  program 
verification:  correctness  proofs  are  constructively  established  by  proof  justification  written  in  algorithmic  nota¬ 
tion.  These  proof  justifications  are  written  as  part  of  the  program,  along  with  the  executable  code  and  correct¬ 
ness  specifications.  A  notation  is  presented  in  which  code,  specifications,  and  justifications  are  interwoven.  For 
example,  if  a  program  contains  a  specification  { exists  x  P(x)},  the  program  also  contains  a  justification  that  exhi¬ 
bits  the  particular  value  of  x  that  makes  P  true.  Analogously,  justifications  may  be  used  to  state  how  universally 
quantified  formulas  are  to  be  instantiated  when  they  are  used  as  hypotheses.  Programs  so  justified  may  be  verified 
by  proving  quantifier-free  formulas.  Additional  classes  of  justifications  serve  related  ends.  Formally,  justifica¬ 
tions  reduce  correctness  to  a  decidable  theory.  Informally,  justifications  establish  the  connection  between  execut¬ 
able  code  and  correctness  specifications,  documenting  the  reasoning  on  which  the  correctness  is  based. 

[Wegn79]  Introduction  and  Overview:  The  primary  purpose  of  this  book  is  to  provide  an  understandable  but 
nontrivial  description  by  active  research  workers  of  concepts  and  research  issues  in  principal  subareas  of 
software  technology.  It  should  be  useful  to  both  the  specialist  and  the  technical  layman  as  a  source  of  factual 
information about research issues and can serve as a starting point for discussions of what to do next. We hope it
will  make  practitioners  aware  of  the  practical  contributions  of  research,  make  researchers  aware  of  the  needs  of 
technology,  and  serve  to  stimulate  greater  collaboration  between  practitioners  and  research  workers.  An  even 
more  ambitious  objective  is  to  encourage  dialogue  among  research  workers  in  different  areas  (such  as  computer 
architecture,  programming  languages  and  data  base  management)  so  that  the  basis  for  an  integrated  approach  to 
computer  systems  can  be  established.  Last  but  not  least,  this  study  may  be  useful  to  funding  agencies  and  other 
policy-making  bodies  in  making  policy  decisions  concerning  future  support  of  research. 

The  first  four  chapters  (Part  I)  consider  the  nature  of  the  software  problem  and  describe  concepts  and 
tools  for  managing  large  software  systems.  The  remaining  sixteen  chapters  (Part  II)  describe  and  analyze  specific 
research  areas.  In  order  to  stimulate  discussion,  over  50  discussion  items  further  explore  specific  research  areas 
or  offer  novel  and  sometimes  controversial  points  of  view. 

[Weid86] Introduction: Most of the work in the evaluation of software development environments has fallen into
one of three categories. First, there are evaluations of particular components such as compilers, editors, or
window managers. These evaluations are useful in their own right, but they fail to consider global aspects of the
environment or how components interact. Second, there are evaluations of particular environments. These
studies usually consider the tools available in that environment but they do not lend themselves to cross environment
comparisons. Third, there are lists of questions and criteria without the details of how to answer the
questions or apply the criteria. These lists are useful, but are frequently difficult to apply in practice.
The purpose of this paper is to address the shortcomings of the above approaches by providing a methodology
that is comprehensive, repeatable, extensible, user-oriented, and partly environment independent. This
methodology has been applied to several Ada environments at the Software Engineering Institute so that they
may be compared objectively according to the same criteria. This paper provides the requirements for an effective
environment evaluation methodology, the individual steps of the methodology, and an example of how the
methodology has been applied in practice.

[Wein71] Abbreviated Preface: This book has only one major purpose -- to trigger the beginning of a new field
of  study:  computer  programming  as  a  human  activity,  or,  in  short,  the  psychology  of  computer  programming.  All 
other  goals  are  subservient  to  that  one.  What  [the  author  is]  trying  to  accomplish  is  to  have  the  reader  say,  upon 
finishing  the  book,  “Yes,  programming  is  not  just  a  matter  of  hardware  and  software.  [The  author]  shall  have  to 
look  at  things  in  a  new  way  from  now  on.” 

As  [the  author  hopes]  the  text  demonstrates  with  numerous  examples,  our  profession  suffers  under  an 
enormous  burden  of  myths  and  half-truths,  many  of  which  my  students  and  [the  author]  have  been  able  to  chal¬ 
lenge  with  extremely  simple  experiments.  But  our  resources  are  limited  and  the  problem  is  great.  There  are,  by 
various  estimates,  hundreds  of  thousands  of  programmers  working  today.  If  our  experiences  are  any  indication, 
each  of  them  could  be  functioning  more  efficiently,  with  greater  satisfaction,  if  he  and  his  manager  would  only 
learn  to  look  upon  the  programmer  as  a  human  being,  rather  than  as  another  one  of  the  machines. 

[The  author  thinks]  that  great  strides  are  possible  in  the  design  of  our  hardware  and  software  too,  if  we  can 
adopt  the  psychological  viewpoint.  [The  author]  would  hope  that  this  book  would  encourage  our  designers  to  add 
this  new  dimension  to  their  design  philosophy.  Not  that  the  few  ideas  and  speculations  in  this  book  will  give  them 
all  the  information  they  need;  but  hopefully  the  book  will  inspire  them  to  go  to  new  sources  for  information.  At 
the moment, programming, sophisticated as it may be from an engineering or mathematical point of view, is so
crude  psychologically  that  even  the  tiniest  insights  should  help  immeasurably.  My  own  experience,  and  the 
experience  of  my  students,  in  teaching,  learning,  and  doing  programming  with  psychological  issues  in  mind, 
bears  out  this  assertion.  [The  author  hopes]  each  of  [his]  readers  will  try  it  for  himself. 

[Wein80] Abstract: Traditional cost/benefit methods in the risk assessment process are predicated on the
occurrence  of  the  threat  and  the  resultant  loss  incurred.  Thus,  savings  can  be  obtained  only  if  the  threat  actually 
occurs.  A  more  appropriate  method  is  the  application  of  the  Bayesian  decision  model  in  the  analysis  of  the  cost- 
effectiveness  of  controls  to  improve  system  integrity.  The  applied  Bayesian  decision  model  is  specifically 
designed  for  cost/benefit  decisions  under  conditions  of  uncertainty  and  allows  for  the  calculation  of  the  benefit 
obtained  when  implementing  controls  for  a  threat  that  does  not  occur.  This  method  also  allows  for  the  calcula¬ 
tion  of  the  cost-effectiveness  of  taking  no  action,  that  is,  deciding  not  to  implement  any  control  against  an  identi¬ 
fied  threat. 

[Weis82]  Abstract:  Error  detection  and  error  correction  are  now  considered  to  be  the  major  cost  factors  in 
software  development.  Much  current  and  recent  research  has  been  devoted  to  finding  ways  to  prevent  software 
errors.  The  purpose  of  this  paper  is  to  compare  error  data  obtained  from  two  different  software-development 
environments  using  different  software-development  methodologies.  The  data  are  used  to  characterize  the  simi¬ 
larities  and  differences  in  the  environments  and  may  be  used  to  evaluate  the  success  with  which  different  metho¬ 
dologies  meet  the  claims  made  for  them.  Data  were  obtained  by  the  use  of  a  goal-directed  data-collection  process 
which  is  described  briefly.  A  key  feature  of  the  process  is  that  data  are  collected  and  validated  concurrently  with 
software  development.  Validation  often  involves  interviewing  the  programmers  supplying  the  data.  The  results 
are  data  distributions  across  characterizations,  such  as  effort  to  correct  error,  type  of  error,  locality  of  error.  The 
distributions  show  that  in  both  environments  the  principal  error  source  was  in  the  design  and  implementation  of 
single  routines.  Requirements  misunderstandings,  specifications  misunderstandings,  and  interface  misunder¬ 
standing  were  all  minor  sources  of  errors.  Few  errors  were  the  result  of  changes,  few  errors  required  more  than 
one attempt at correction, and few error corrections resulted in other errors. Most errors were correctable in a
day or less.

[Weis84] Abstract: Program slicing is a method for automatically decomposing programs by analyzing their data
flow  and  control  flow.  Starting  from  a  subset  of  a  program’s  behavior,  slicing  reduces  that  program  to  a  minimal 
form  which  still  produces  that  behavior.  The  reduced  program,  called  a  “slice,”  is  an  independent  program 
guaranteed  to  represent  faithfully  the  original  program  within  the  domain  of  the  specified  subset  of  behavior. 

Some  properties  of  slices  are  presented.  In  particular,  finding  statement-minimal  slices  is  in  general 
unsolvable,  but  using  data  flow  analysis  is  sufficient  to  find  approximate  slices.  Potential  applications  include 
automatic  slicing  tools  for  debugging  and  parallel  processing  of  slices. 
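
As an illustration of the slicing idea summarized in [Weis84], the following is a minimal sketch and not the
algorithm of the cited paper. It assumes straight-line code with no branches or loops, and it represents each
statement only by the variable it defines and the variables it uses. The names backward_slice and program are
hypothetical.

```python
def backward_slice(statements, criterion_vars):
    """Return indices of statements that may affect criterion_vars at program end.

    statements is a list of (defined_variable, used_variables) pairs for
    straight-line code only (an assumption of this sketch).
    """
    relevant = set(criterion_vars)
    kept = []
    # Walk backwards, keeping each statement that defines a still-relevant variable.
    for index in range(len(statements) - 1, -1, -1):
        defined, used = statements[index]
        if defined in relevant:
            kept.append(index)
            relevant.discard(defined)   # this definition satisfies the demand
            relevant |= set(used)       # its operands become relevant in turn
    return sorted(kept)


program = [
    ("n", set()),                # 0: n = read()
    ("total", set()),            # 1: total = 0
    ("count", set()),            # 2: count = 0
    ("total", {"total", "n"}),   # 3: total = total + n
    ("count", {"count"}),        # 4: count = count + 1
]

slice_for_total = backward_slice(program, {"total"})
slice_for_count = backward_slice(program, {"count"})
```

In this toy program the slice for total keeps statements 0, 1, and 3, while the slice for count keeps only
statements 2 and 4. Each slice is a smaller program that still computes the chosen variable.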

[Weis85a]  Abbreviated  Introduction:  Empirically  comparing  structural  test  coverage  metrics  reveals  that  test 
sets  that  satisfy  one  metric  are  likely  to  satisfy  another  metric  as  well. 

[Weis85b]  Abstract:  A  definition  of  software  reliability  is  proposed  in  which  reliability  is  treated  as  a  generaliza¬ 
tion  of  the  probability  of  correctness  of  the  software  in  question.  The  definition  is  parameterized  by  the  distribu¬ 
tion  characterizing  the  operational  environment.  It  is  shown  that  the  definition  can  be  used  to  provide  many 
natural  models  of  reliability  by  varying  an  integer  parameter,  and  that  it  may  be  approximated  reasonably  using 
well-chosen  test  sets.  It  is  proved  that,  under  fairly  weak  conditions,  one  cannot  hope  to  measure  reliability 
exactly  by  using  finite  test  sets. 

[Weis85c]  Abstract:  An  effective  data  collection  methodology  for  evaluating  software  development  methodolo¬ 
gies  was  applied  to  five  different  software  development  projects.  Results  and  data  from  three  of  the  projects  are 
presented.  Goals  of  the  data  collection  include  characterizing  changes,  errors,  projects,  and  programmers,  iden¬ 
tifying  effective  error  detection  and  correction  techniques,  and  investigating  ripple  effects. 

The  data  collected  consists  of  changes  (including  error  corrections)  made  to  the  software  after  code  was 
written  and  baselined,  but  before  testing  began.  Data  collections  and  validations  were  concurrent  with  software 
development. Changes reported were verified by interviews with programmers. Analysis of the data showed patterns
that were used in satisfying goals of the data collection. Some of the results are summarized in the following:

1.  Error  corrections  aside,  the  most  frequent  type  of  change  was  an  unplanned  design  modification. 

2.  The  most  common  type  of  error  was  one  made  in  the  design  or  implementation  of  a  single  component  of  the 
system.  Incorrect  requirements,  misunderstandings  of  functional  specifications,  interfaces,  support  software 
and  hardware,  and  languages  and  compilers  were  generally  not  significant  sources  of  errors. 

3.  Despite  a  significant  number  of  requirements  changes  imposed  on  some  projects,  there  was  no  corresponding 
increase  in  frequency  of  requirements  misunderstandings. 

4.  More  than  75%  of  all  changes  took  a  day  or  less  to  make. 

5.  Changes  tended  to  be  nonlocalized  with  respect  to  individual  components  but  localized  with  respect  to  subsys¬ 
tems. 

6.  Relatively  few  changes  resulted  in  errors.  Relatively  few  errors  required  more  than  one  attempt  at  correction. 

7.  Most  errors  were  detected  by  executing  the  program.  The  cause  of  most  errors  was  found  by  reading  code. 
Support  facilities  and  techniques  such  as  traces,  dumps,  cross-reference  and  attribute  listings,  and  program 
proving  were  rarely  used. 

[Weis86]  Abstract:  A  definition  of  software  reliability  is  proposed  in  which  reliability  is  treated  as  a  generaliza¬ 
tion  of  the  probability  of  correctness  of  the  software  in  question.  The  definition  is  parameterized  by  the  distribu¬ 
tion  characterizing  the  operational  environment,  and  by  a  tolerance  function  characterizing  a  notion  of  degree  of 
correctness.  It  is  shown  that  the  definition  can  be  used  to  provide  many  natural  models  of  reliability  by  varying 
the  tolerance  function,  and  that  it  may  be  reasonably  approximated  using  well-chosen  test  sets.  It  is  proved  that, 
under  fairly  weak  conditions,  one  cannot  hope  to  measure  reliability  exactly  by  using  finite  test  sets. 

[Weis88a]  Abstract:  Representing  a  concurrent  program  as  a  set  of  simulating,  sequential  programs  provides  a 
solution  to  the  reproducible  testing  problem  as  well  as  a  formal  foundation  for  a  theory  of  concurrent  program 
testing. It is shown how this model of concurrent programs is used to extend the methods and theory of testing
sequential programs to concurrent programs.

[Weis88b] Abstract: A definition of software reliability is proposed in which reliability is treated as a generalization
of the probability of correctness of the software in question. A tolerance function is introduced as a method
of  characterizing  an  acceptable  level  of  correctness.  This  in  turn  is  used,  together  with  the  probability  function 
defining  the  operational  input  distribution,  as  a  parameter  of  the  definition  of  reliability  by  varying  the  tolerance 
function  and  that  it  may  be  reasonably  approximated  using  well-chosen  test  sets.  It  is  also  shown  that  there  is  an 
inherent  limitation  to  the  measurement  of  reliability  using  finite  test  set. 

[Weyu80c]  Abstract:  The  theory  of  test  data  selection  proposed  by  Goodenough  and  Gerhart  is  examined.  In 
order to extend and refine this theory, the concepts of a revealing test criterion and a revealing subdomain are
proposed.  These  notions  are  then  used  to  provide  a  basis  for  constructing  program  tests. 

A  subset  of  a  program’s  input  domain  is  revealing  if  the  existence  of  one  incorrectly  processed  input 
implies  that  all  of  the  subset’s  elements  are  processed  incorrectly.  The  intent  of  this  notion  is  to  partition  the  pro¬ 
gram’s  domain  in  such  a  way  that  all  elements  of  an  equivalence  class  are  either  processed  correctly  or 
incorrectly.  A  test  set  is  then  formed  by  choosing  one  element  from  each  class.  This  process  represents  perfect 
program  testing.  For  a  practical  testing  strategy,  the  domain  is  partitioned  into  subdomains  which  are  revealing 
for  errors  considered  likely  to  occur. 

Three  programs  which  have  previously  appeared  in  the  literature  are  discussed  and  tested  using  the 
notions  developed  in  the  paper. 
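
The partition-and-select step described in [Weyu80c] can be sketched as follows. This is an illustrative
assumption, not code from the cited paper: the domain, the classification function sign_class, and the faulty
program buggy_abs are all hypothetical, chosen so that the "negative" subdomain is revealing for the seeded
fault.

```python
def select_tests(domain, subdomain_of):
    """Pick one representative input from each subdomain of the partition."""
    representatives = {}
    for value in domain:
        # Keep the first value seen for each subdomain label.
        representatives.setdefault(subdomain_of(value), value)
    return sorted(representatives.values())


def sign_class(x):
    """Hypothetical partition of the integers by sign."""
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"


def buggy_abs(x):
    """Seeded fault: negative inputs are returned unchanged."""
    return x if x >= 0 else x


def spec_abs(x):
    """Stand-in for the specification (the oracle)."""
    return abs(x)


domain = range(-3, 4)
tests = select_tests(domain, sign_class)
failures = [t for t in tests if buggy_abs(t) != spec_abs(t)]
```

Because every negative input is processed incorrectly by buggy_abs, any single representative of the negative
subdomain exposes the fault. That is the property the abstract calls revealing.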

[Weyu82]  Abstract:  A  frequently  invoked  assumption  in  program  testing  is  that  there  is  an  oracle  (i.e.,  the  tester 
or  an  external  mechanism  can  accurately  decide  whether  or  not  the  output  produced  by  a  program  is  correct).  A 
program  is  non-testable  if  either  an  oracle  does  not  exist  or  the  tester  must  expend  some  extraordinary  amount  of 
time  to  determine  whether  or  not  the  output  is  correct.  The  reasonableness  of  the  oracle  assumption  is  examined 
and  the  conclusion  is  reached  that  in  many  cases  this  is  not  a  realistic  assumption.  The  consequences  of  assum¬ 
ing  the  availability  of  an  oracle  are  examined  and  alternatives  investigated. 

[Weyu83]  Abstract:  Despite  the  almost  universal  reliance  on  testing  as  the  means  of  locating  software  errors  and 
its  long  history  of  use,  few  criteria  have  been  proposed  for  deciding  when  software  has  been  thoroughly  tested. 
As  a  basis  for  the  development  of  usable  notions  of  test  data  adequacy,  an  abstract  definition  is  proposed  and 
examined,  and  approximations  to  this  definition  are  considered. 

[Weyu84a]  Abbreviated  Introduction:  Rapps  and  Weyuker  introduced  a  family  of  test  data  selection  criteria 
based  on  data  flow  analysis  as  used  in  optimizing  compilers.  Most  test  data  selection  criteria  rely  solely  on  the 
program’s  control  flow  characteristics.  By  incorporating  data  flow  information  into  the  selection  procedure,  it  is 
possible  to  focus  on  associations  between  physically  disjoint  portions  of  the  program  which  are  related  by  the 
flow  of  data.  In  this  paper  we  determine  the  upper  bounds  on  the  amount  of  test  data  needed  to  satisfy  each  cri¬ 
terion,  and  thus  the  relative  difficulty  of  fulfilling  each.  We  call  such  an  upper  bound  the  complexity  of  the  cri¬ 
terion. 

[Weyu84b]  Abstract:  A  test  data  adequacy  criterion  is  a  set  of  rules  used  to  determine  whether  or  not  sufficient 
testing  has  been  performed.  A  general  axiomatic  theory  of  test  data  adequacy  is  developed,  and  five  previously 
proposed  adequacy  criteria  are  examined  to  see  which  of  the  axioms  are  satisfied.  It  is  shown  that  the  axioms  are 
consistent,  but  that  only  two  of  the  criteria  satisfy  all  of  the  axioms. 

[Weyu88a]  Abstract:  A  family  of  test  data  adequacy  criteria  employing  data  flow  information  has  been  previ¬ 
ously  proposed,  and  theoretical  complexity  analysis  performed.  This  paper  describes  an  empirical  study  to  help 
determine  the  actual  cost  of  using  these  criteria.  This  should  help  establish  the  practical  usefulness  of  these  cri¬ 
teria  in  testing  software,  and  serve  as  a  means  of  predicting  the  amount  of  testing  needed  for  a  given  program. 

[Weyu89]  Overview:  Rebuttal  of  first  main  point  (inconsistent  evaluation  of  previously  defined  criteria)  is  based
on  the  fact  that  definitions  used  in  the  original  paper  were  taken  directly  from  the  literature.  Rebuttal  of  second 
main  point  (lack  of  precision  and  formality)  revolves  around  assumption  of  a  software  engineer’s  understanding 
of  an  adequacy  criterion. 

[Whit78b]  Abstract:  This  paper  presents  a  testing  strategy  designed  to  detect  errors  in  the  control  flow  of  a  com¬ 
puter  program,  and  the  conditions  under  which  this  strategy  is  reliable  are  given  and  characterized.  The  control 
flow  statements  in  a  computer  program  partition  the  input  space  into  a  set  of  mutually  exclusive  domains,  each  of 
which  corresponds  to  a  particular  program  path  and  consists  of  input  data  points  which  cause  that  path  to  be  exe¬ 
cuted.  The  testing  strategy  generates  test  points  to  examine  the  boundaries  of  a  domain  to  detect  whether  a 
domain  error  has  occurred,  as  either  one  or  more  of  these  boundaries  will  have  shifted  or  else  the  corresponding 
predicate  relational  operator  has  changed.  If  test  points  can  be  chosen  within  ε  of  each  boundary,  under  the
appropriate  assumptions,  the  strategy  is  shown  to  be  reliable  in  detecting  domain  errors  of  magnitude  greater
than  ε.  Moreover,  the  number  of  test  points  required  to  test  each  domain  grows  only  linearly  with  both  the  dimensionality
of  the  input  space  and  the  number  of  predicates  along  the  path  being  tested.
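
The boundary-testing idea in this abstract can be illustrated with a minimal sketch for a single one-dimensional border. Everything named here is an assumption made for illustration (the predicate `x <= 10`, the helper names, and the choice of one ON point and one OFF point); the paper's strategy addresses borders in higher-dimensional input spaces, and a real test would compare program outputs rather than predicates directly.

```python
def on_off_points(border, eps):
    """Return an ON point (on the border) and an OFF point (eps outside the domain x <= border)."""
    return border, border + eps


def detects_domain_error(correct_pred, actual_pred, border, eps):
    """True if either test point is routed to a different domain by the actual predicate."""
    return any(correct_pred(x) != actual_pred(x) for x in on_off_points(border, eps))


# Hypothetical intended predicate and three faulty variants.
correct = lambda x: x <= 10
shifted_border = lambda x: x <= 10.5      # border shifted by 0.5 (greater than eps)
changed_operator = lambda x: x < 10       # relational operator changed
tiny_shift = lambda x: x <= 10.05         # border shifted by 0.05 (less than eps)

EPS = 0.1
found_shift = detects_domain_error(correct, shifted_border, 10, EPS)
found_operator = detects_domain_error(correct, changed_operator, 10, EPS)
found_tiny = detects_domain_error(correct, tiny_shift, 10, EPS)
```

With ε = 0.1 in this sketch, the larger border shift and the changed operator are both detected, while a shift smaller than ε escapes, which mirrors the "magnitude greater than ε" condition in the abstract.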

[Whit80]  Abstract:  Many  current  software  development  methodologies  require  designers  to  select  design 
options  based  on  a  comparative  evaluation  of  the  merits  of  various  design  alternatives.  However,  techniques  for
the  evaluation  and  continuous  monitoring  of  software  quality  lack  sufficient  development  or  generality  to  have
achieved  widespread  acceptance.  While  there  is  much  interesting  work  addressing  the  assessment  of  software
code  quality,  few  measurement  techniques  are  applicable  to  software  designs.  In  this  paper,  software  design  qual¬ 
ity  is  emphasized.  A  general  formalism  for  expressing  software  designs  is  presented,  and  two  metrics  of  design 
quality,  as  functions  of  control  flow  and  data  flow  complexity,  are  proposed. 

[WhitiM]  Abstract:  An  automated  testing  approach  called  the  Domain  Testing  Strategy  has  been  developed  to 
primarily  detect  errors  in  software  control  flow,  though  it  may  also  detect  errors  in  computation.  Detection  of 
control  flow  errors  is  accomplished  by  determining  that  the  domain  boundary  is  correct  within  an  acceptable 
error  bound.  An  analysis  of  this  strategy  has  appeared  in  the  literature  which  identifies  those  conditions  under 
which  this  error  bound  is  not  acceptable,  and  methods  were  proposed  to  select  test  points  which  achieved  a 
reduced  error  bound.  It  is  the  objective  of  this  paper  to  provide  an  alternative  measure  of  error  bound  which  is 
more  easily  calculated,  and  an  heuristic  method  for  selecting  test  points.  Although  the  use  of  this  measure  and 
test  selection  method  will  not  result  in  the  same  level  of  reduction  in  error  bounds  in  Domain  Testing  as  those 
previously  proposed,  it  is  argued  that  the  reduction  in  effort  can  justify  this  alternative  approach. 

[Whit88a]  Abstract:  One  of  the  serious  limitations  of  domain  testing  is  the  potentially  infinite  number  of 
domains  to  be  examined  in  the  presence  of  iteration  loops  in  the  computer  program.  The  purpose  of  this  paper  is 
to  show  that  only  a  small  number  of  domains  need  to  be  examined,  and  that  one  can  concentrate  on  testing  cer¬ 
tain  borders  of  those  domains.  It  is  first  shown  that  for  definite  loops,  where  the  number  of  iterations  is  known 
upon  entry,  iteration  loops  can  be  represented  by  a  primitive  recursive  schema.  This  involves  the  identification  of 
simple  loop  patterns,  and  it  is  proven  that  these  simple  loop  patterns  can  be  used  as  basic  building  blocks  to  form 
arbitrarily  complex  loop  patterns.  It  is  further  shown  that  domain  testing  can  be  adapted  to  test  these  simple 
loop  patterns,  which  precludes  the  necessity  of  having  to  test  any  of  the  complex  loop  patterns.  A  bound  is 
obtained  on  the  number  of  loop  patterns  that  have  to  be  tested  and  worst  cases  identified  for  the  corresponding 
control  flow  graphs:  for  loop  patterns  from  the  perspective  of  the  exit  node,  for  loop  patterns  required  to  test  all 
predicate  nodes,  and  for  loop  patterns  required  to  test  all  final  predicate  nodes.  The  paper  concludes  with  some 
recommendations  for  those  simple  loop  patterns  which  should  be  selected  first  for  testing  in  order  to  provide 
greatest  information  about  errors  in  the  program,  and  identifies  some  problems  for  future  research. 

[Whit88b]  Position  Statements  Included: 

•  Gensheimer,  E.L.  “Technology  Transfer  in  the  Product  Verification/Quality  Areas.” 

•  Good,  D.I.  “Transferring  Testing  and  Verification  Technology  to  Industry.”

•  Hennell,  M.A.  “Technology  Transfer.” 

•  Miller,  E.  “Testing  and  Verification  Problems  in  Industry:  Technology  Transfer.” 

•  Sneed,  H.M.  “State  Coverage  of  Embedded  Real-time  Programs.” 

[Wien84]  Abstract:  In  this  paper,  a  formal  model  of  the  software  manloading  pattern,  the  Rayleigh  model,  is 
described  and  then  applied  to  four  Bankers  Trust  Company  (BTCo.)  new  development  projects  possessing  complete
life  cycle  manloading  data  (maintenance  phase  included).  To  fit  the  Rayleigh  curve  to  a  project’s  manloading
scores,  (nonlinear)  regression  was  used  to  obtain  least  squares  estimates  of  the  Rayleigh  parameters,  which,
in  turn,  were  used  to  generate  the  Rayleigh  manloading  curve.  For  all  four  projects,  deviation  from  the  Rayleigh 
curve  was  small  and  constant  throughout  the  software  development  phases  (i.e.,  preliminary  design  through 
implementation);  however,  the  Rayleigh  curve  consistently  deviated  from  the  actual  manloading  during  system 
maintenance,  underestimating  the  amount  of  maneffort  expended.  Restricting  maintenance  maneffort  to  man¬ 
power  expended  on  repair  of  system  faults  (“corrective”  maintenance)  resulted  in  a  single  Rayleigh  curve  that 
could  be  applied  over  the  entire  BTCo.  life  cycle.  Furthermore,  this  corrective  portion  of  the  maintenance  effort 
could  be  accurately  forecasted  from  the  Rayleigh  curve  fit  to  software  development.  Implications  of  these  find¬ 
ings  for  software  management  are  discussed. 
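
The curve-fitting step described in this abstract can be sketched as follows. The abstract does not state the equation; the code assumes the form commonly attributed to Norden and Putnam, m(t) = 2Kat·exp(−at²), where K is total effort and a is a shape parameter. The fit shown is a log-linearized least squares (regressing ln(m/t) on t²), which is a simpler stand-in for the nonlinear regression the paper reports, and the data are synthetic.

```python
import math


def rayleigh(t, K, a):
    """Staffing level at time t under the assumed Rayleigh form 2*K*a*t*exp(-a*t^2)."""
    return 2.0 * K * a * t * math.exp(-a * t * t)


def fit_rayleigh(times, staffing):
    """Estimate (K, a) by ordinary least squares on ln(m/t) = ln(2*K*a) - a*t^2.

    Requires every time and staffing value to be strictly positive.
    """
    xs = [t * t for t in times]
    ys = [math.log(m / t) for t, m in zip(times, staffing)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    a = -slope
    K = math.exp(intercept) / (2.0 * a)
    return K, a


# Synthetic manloading data generated from known parameters (not from the paper).
times = [1, 2, 3, 4, 5, 6, 7, 8]
observed = [rayleigh(t, 100.0, 0.05) for t in times]
K_hat, a_hat = fit_rayleigh(times, observed)
peak_time = 1.0 / math.sqrt(2.0 * a_hat)  # time of maximum staffing under this form
```

On noisy real data the log transform reweights the errors, so the estimates would generally differ from a true nonlinear least-squares fit.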

[Wild88]  Overview:  Current  logic  programming  systems,  as  typified  by  PROLOG,  contain  limitations  which  res¬ 
trict  their  usefulness  during  the  specification,  design  and  testing  of  software.  A  major  limitation  is  the  inability  to 
perform  analysis  in  the  presence  of  incomplete  information.  Three  sources  of  incompleteness  are  discussed 
here.  First,  the  analysis  is  incomplete  because  the  system  is  only  partially  finished.  Second,  in  order  to  provide 
overall  guidance,  the  analysis  is  first  performed  at  an  abstract  level.  The  abstraction  can  be  done  selectively  in 
order  to  focus  the  analysis.  Third,  some  forms  of  incompleteness  can  only  be  resolved  at  run  time  by  examining 
the  properties  of  objects  which  are  determined  dynamically.  It  is  the  program  itself  which  resolves  the  last  form 
of  incompleteness. 

In  logic  programming,  the  program  is  expressed  in  terms  of  predicates  relating  the  input  and  output  and 
execution  proceeds  by  constructing  a  proof  of  these  input/output  relationships.  Generic  Constraint  Logic  Programming
is  a  form  of  logic  programming  developed  to  address  incompleteness  in  analysis.

[Wile88]  Abstract:  Defining,  creating,  and  manipulating  persistent  typed  objects  will  be  central  activities  in  future 
software  environments.  PGRAPHITE  is  a  working  prototype  through  which  we  are  exploring  the  requirements
for  the  persistent  object  capability  of  an  object  management  system  in  the  Arcadia  software  environment. 

PGRAPHITE  represents  both  a  set  of  abstractions  that  define  a  model  for  dealing  with  persistent  objects
in  an  environment  and  a  set  of  implementation  strategies  for  realizing  that  model.  PGRAPHITE  currently  provides
a  type  definition  mechanism  for  one  important  class  of  types,  namely  directed  graphs,  and  the  automatic
generation  of  Ada  implementations  for  the  defined  types,  including  their  persistence  capabilities. 

We  present  PGRAPHITE,  describe  and  motivate  its  model  of  persistence,  outline  the  implementation
strategies  that  it  embodies,  and  discuss  some  of  our  experience  with  the  current  version  of  the  system. 

[WUI79]  Abstract:  In  languages  such  as  Pascal,  the  programmer  can  arrange  to  have  the  compiler  check  such 
things  as  the  range  of  the  value  of  a  variable  only  by  defining  a  new  type  or  sub-type.  [The  author  has]  investigated 
how  more  powerful  checking  facilities  might  be  provided  if  they  were  divorced  from  the  type  machinery,  and  also 
if  the  necessary  language  constructs  were  designed  independent  of  what  any  particular  compiler  would  check  at 
compile-time. 

[Wing89]  Abstract:  Toward  the  overall  goal  of  putting  formal  specifications  to  practical  use  in  the  design  of  large 
systems,  we  explore  the  combination  of  two  specification  methods:  using  temporal  logic  to  specify  concurrency 
properties  and  using  an  existing  specification  language,  Ina  Jo,  to  specify  functional  behavior  of  nondeterministic 
systems.  In  this  paper,  we  give  both  informal  and  formal  descriptions  of  both  current  Ina  Jo  and  Ina  Jo  enhanced 
with  temporal  logic.  We  include  details  of  a  simple  example  to  demonstrate  the  use  of  the  proof  system  and
details  of  an  extended  example  to  demonstrate  the  expressiveness  of  the  enhanced  language.  We  discuss  at  length 
our  language  design  goals,  decisions,  and  their  implications.  The  Appendix  contains  a  list  of  axioms,  rules  of 
inference,  derived  rules,  and  theorem  schemata  for  the  enhanced  formal  system. 

[Wirs83]  Summary:  Hierarchical  abstract  data  types  are  algebraic  specifications  of  computation  structures 
where  certain  sorts,  function  symbols,  and  axioms  are  designated  as  being  primitive.  On  hierarchical  abstract 
data  types  additional  structure  is  imposed.  An  algebraic  specification  is  thus  decomposed  into  several  well- 
separated  levels,  such  that  both  the  understanding  and  the  independent  implementation  of  the  levels  is  sup¬ 
ported.  This  paper  provides  both  model-theoretic  and  deduction-oriented  conditions  guaranteeing  the  soundness 
of  a  hierarchical  specification.  Furthermore  necessary  and  sufficient  conditions  for  the  existence  of  initial  and 
terminal  models  are  investigated,  and  their  close  connection  to  the  soundness  of  a  hierarchy  is  demonstrated.  In 
order  to  provide  freedom  and  flexibility  for  specifications  a  wide  class  of  axioms  -  namely  universal-existential 
formulas  -  are  admitted. 

[Wolf85a]  Abstract:  The  Ada  programming  language  is  intended  for  the  implementation  of  large  and  complex 
software  systems.  Such  systems  often  exceed  a  half-million  lines  of  code;  if  their  developers  adhere  to  the 
software  engineering  maxim  that  no  module  should  contain  more  than  50  lines  of  code,  then  the  number  of
modules  in  such  systems  will  exceed  10,000!  As  DeRemer  and  Kron  point  out,  dealing  with  a  “large  collection  of 
modules  to  form  a  ‘system’  is  an  essentially  distinct  and  different  intellectual  activity  from  that  of  constructing 
the  individual  modules.”  Thus,  developers  and  maintainers  of  large  Ada  systems  will  require  tools  beyond  the 
syntax-directed  editors,  compilers,  debuggers  and  so  on  needed  for  “programming-in-the-small.”  They  will  need 
extensive  support  for  describing,  analyzing,  organizing,  and  managing  the  modules  in  a  system  -  that  is,  an
environment  for  “programming-in-the-large.” 

[Wolf86c]  Abstract:  Despite  the  importance  of  describing  and  analyzing  the  relationships  among  a  software  sys¬ 
tem’s  components,  most  languages  and  development  environments  do  not  provide  suitable  support  for  these 
activities.  While  Ada  and  the  various  existing  Ada  environments  offer  some  assistance,  the  capabilities  they  offer 
are  inadequate  for  use  in  truly  large  and  complex  software  development  projects.  To  address  these  shortcomings, 
we  are  developing  the  AdaPIC  toolset,  which  we  envision  as  an  important  component  of  an  Ada  software 
development  environment.  The  AdaPIC  toolset  is  one  particular  instantiation,  specifically  adapted  for  use  with 
Ada,  of  the  more  general  collection  of  language  features  and  analysis  capabilities  that  constitute  the  PIC 
approach  to  describing  and  analyzing  relationships  among  software  system  components.  This  toolset  is  being 
tailored  to  support  an  incremental  approach  to  the  interface  control  aspects  of  the  software  development  pro¬ 
cess.  Following  a  discussion  of  the  interface  control  and  incremental  development  concepts,  this  paper  describes 
the  AdaPIC  toolset,  concentrating  on  its  analysis  tools  and  support  for  incremental  development  and  demon¬ 
strating  how  it  contributes  to  the  technology  for  developing  large  Ada  software  systems. 

[Wolv74]  Abstract:  The  work  of  software  cost  forecasting  falls  into  two  parts.  First  we  make  what  we  call  structural
forecasts,  and  then  we  calculate  the  absolute  dollar-volume  forecasts.  Structural  forecasts  describe  the
technology  and  function  of  a  software  project,  but  not  its  size.  We  allocate  resources  (costs)  over  the  project’s 
life  cycle  from  the  structural  forecasts.  Judgement,  technical  knowledge,  and  econometric  research  should  com¬ 
bine  in  making  the  structural  forecasts.  A  methodology  based  on  a  25  x  7  structural  forecast  matrix  that  has  been 
used  by  TRW  with  good  results  over  the  past  few  years  is  presented  in  this  paper.  With  the  structural  forecasts  in 
hand,  we  go  on  to  calculate  the  absolute  dollar-volume  forecasts.  The  general  logic  followed  in  “absolute”  cost 
estimating  can  be  used  on  either  a  mental  process  or  an  explicit  algorithm.  A  cost  estimating  algorithm  is 
presented  and  five  traditional  methods  of  software  cost  forecasting  are  described:  top-down  estimating,  similari¬ 
ties  and  difference  estimating,  ratio  estimating,  standards  estimating,  and  bottom-up  estimating.  All  forecasting 
methods  suffer  from  the  need  for  a  valid  cost  data  base  for  many  estimating  situations.  Software  information  ele¬ 
ments  that  experience  has  shown  to  be  useful  in  establishing  such  a  data  base  are  given  in  the  body  of  the  paper. 
Major  pricing  pitfalls  are  identified.  Two  case  studies  are  presented  that  illustrate  the  software  cost  forecasting 
methodology  and  historical  results.  Topics  for  further  work  and  study  are  suggested. 

[Wood79a]  Abstract:  This  paper  discusses  the  need  for  measures  of  complexity  and  of  unstructuredness  of  programs.
A  simple  language  independent  concept  is  put  forward  as  a  measure  of  control  flow  complexity  in  program
text  and  is  then  developed  for  use  as  a  measure  of  unstructuredness.  The  proposed  metric  is  compared  with
other  metrics,  the  most  notable  of  which  is  the  cyclomatic  complexity  measure.  Some  experience  with  automatic 
tools  for  obtaining  these  metrics  is  reported. 

[Wood79b]  Abstract:  The  effect  of  a  variation  in  problem  complexity  and  how  the  variation  relates  to  program¬ 
ming  complexity  is  predicted  and  measured.  An  experiment  was  conducted  in  which  eighteen  graduate  students 
programmed  two  variations  of  the  same  small  algorithm  where  the  problem  complexity  varied  by  25  percent. 
Eight  measurable  program  characteristics  are  compared  with  predicted  values  obtained  using  only  two  known 
parameters.  The  agreement  between  observed  and  predicted  values  is  very  good.  Both  predicted  and  observed 
measurements  indicate  that  the  25  percent  increase  in  problem  complexity  results  in  a  100  percent  increase  in 
programming  complexity. 

[Wood80b]  Abstract:  There  are  a  number  of  practical  difficulties  in  performing  a  path  testing  strategy  for  com¬ 
puter  programs.  One  problem  is  in  deciding  which  paths,  out  of  a  possible  infinity,  to  use  as  test  cases.  A  hierar¬ 
chy  of  structural  test  metrics  is  suggested  to  direct  the  choice  and  to  monitor  the  coverage  of  test  paths.  Another 
problem  is  that  many  of  the  chosen  paths  may  be  unfeasible  in  the  sense  that  no  test  data  can  ever  execute  them.
Experience  with  the  use  of  “allegations”  to  circumvent  this  problem  and  prevent  the  static  generation  of  many
unfeasible  paths  is  reported. 

[Wood81a]  Abstract:  As  the  cost  of  programming  becomes  a  major  component  of  the  cost  of  computer  sys¬ 
tems,  it  becomes  imperative  that  program  development  and  maintenance  be  better  managed.  One  measurement 
a  manager  could  use  is  programming  complexity.  Such  a  measure  can  be  very  useful  if  the  manager  is  confident 
that  the  higher  the  complexity  measure  is  for  a  programming  project,  the  more  effort  it  takes  to  complete  the  pro¬ 
ject  and  perhaps  to  maintain  it.  Until  recently  most  measures  of  complexity  were  based  only  on  intuition  and 
experience.  In  the  past  3  years  two  objective  metrics  have  been  introduced,  McCabe’s  cyclomatic  number  v(G) 
and  Halstead’s  effort  measure  E.  This  paper  reports  an  empirical  study  designed  to  compare  these  two  metrics 
with  a  classic  size  measure,  lines  of  code.  A  fourth  metric  based  on  a  model  of  programming  is  introduced  and 
shown  to  be  better  than  the  previously  known  metrics  for  some  experimental  data.
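
For readers unfamiliar with the cyclomatic number mentioned in this abstract, the sketch below computes McCabe's v(G) = e − n + 2p for a control flow graph with e edges, n nodes, and p connected components. The adjacency-dictionary representation, the function name, and the example graph (an if/else followed by a while loop) are assumptions chosen for illustration and are not taken from the paper.

```python
def cyclomatic_number(cfg, components=1):
    """Compute v(G) = e - n + 2p for a control flow graph.

    cfg maps each node to a list of its successor nodes; components is p,
    the number of connected components (1 for a single routine).
    """
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2 * components


# Hypothetical routine: an if/else followed by a while loop.
example_cfg = {
    "entry": ["if"],
    "if": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
v = cyclomatic_number(example_cfg)

# A straight-line routine with no decisions, for comparison.
straight_line = {"a": ["b"], "b": ["c"], "c": []}
v_straight = cyclomatic_number(straight_line)
```

The example graph has 9 edges and 8 nodes, so v(G) = 9 − 8 + 2 = 3, which matches the usual "number of decisions plus one" reading for a structured routine with two decisions.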

[Wood81b]  Abstract:  An  experiment  was  conducted  to  investigate  how  different  types  of  modularization  and
comments  are  related  to  programmers’  ability  to  understand  programs.  Forty-eight  experienced  programmers 
were  given  eight  different  versions  of  the  same  program  and  asked  to  answer  a  twenty  question  quiz  used  to  meas¬ 
ure  comprehension.  These  eight  different  versions  were  the  result  of  the  program  being  constructed  with  four 
types  of  modularization  (monolithic,  functional,  super,  and  abstract  data  type),  each  with  and  without  com¬ 
ments.  Those  subjects  whose  programs  contained  comments  were  able  to  answer  more  questions  than  those 
without  comments.  Also,  those  subjects  who  were  given  the  abstract  data  type  version  of  the  program  were  able 
to  do  significantly  better  than  those  with  any  other  type  of  modularization. 

[Wood81c]  Overview:  In  order  to  improve  upon  the  complexity  measurement  results  obtained  using  the  LOC, 
McCabe’s  cyclomatic  number,  and  Halstead’s  software  science  effort  measure,  the  authors  have  developed 
another  model  for  programming  complexity.  This  measure  includes  consideration  of  control,  data,  and  implicit 
module  interconnections. 

[Wood88]  Abstract:  Despite  the  intrinsic  appeal  of  the  mutation  approach  to  testing,  its  disadvantage  in  being
computationally  expensive  has  hampered  its  widespread  acceptance.  When  weak  mutation  was  introduced  as  a
less  expensive  and  less  stringent  form  of  mutation  testing,  the  original  technique  was  renamed  strong  mutation.
This  paper  argues  that  strong  mutation  testing  and  weak  mutation  testing  are  in  fact  extreme  ends  of  a  spectrum. 
The  term  firm  mutation  is  introduced  here  to  represent  the  middle  ground  in  this  spectrum.  This  paper  also 
argues,  by  means  of  a  number  of  small  examples,  that  there  is  a  potential  problem  concerning  the  criterion  for 
deciding  whether  a  mutant  is  ‘dead’  or  ‘live’.  A  variety  of  solutions  are  suggested.  Finally,  practical  considerations
for  a  firm  mutation  testing  system,  with  greater  user  control  over  the  nature  of  result  comparison,  are  discussed.
Such  a  system  is  currently  under  development  as  part  of  an  interpretive  development  environment.
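
The two ends of the spectrum described in this abstract can be illustrated with a small sketch. The program, the mutant (`+` replaced by `-`), and the helper names are hypothetical. Weak mutation is modelled here as comparing the state immediately after the mutated component, and strong mutation as comparing the final output; firm mutation, as the abstract describes it, would compare at some point in between.

```python
import operator


def run(component, a, b):
    """Run a tiny program; return (state after the mutated component, final output)."""
    s = component(a, b)   # the component that the mutant alters
    result = s > 0        # remainder of the program
    return s, result


def weakly_killed(test):
    """Mutant is killed if the state right after the component differs."""
    return run(operator.add, *test)[0] != run(operator.sub, *test)[0]


def strongly_killed(test):
    """Mutant is killed only if the final program output differs."""
    return run(operator.add, *test)[1] != run(operator.sub, *test)[1]


test_a = (3, 1)   # intermediate states 4 and 2, both outputs True
test_b = (1, 3)   # intermediate states 4 and -2, outputs True and False

verdicts = {
    "test_a": (weakly_killed(test_a), strongly_killed(test_a)),
    "test_b": (weakly_killed(test_b), strongly_killed(test_b)),
}
```

Here `test_a` kills the mutant under the weak comparison but leaves it live under the strong one, which is the kind of disagreement over 'dead' versus 'live' that the choice of comparison point introduces.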

[Wu87c]  Abstract:  Coverage  metrics  have  traditionally  been  used  to  evaluate  the  effectiveness  of  procedures  for 
testing  software  systems.  In  practice,  however,  the  metrics  are  heavily  influenced  by  the  characteristics  of  tradi¬ 
tional  programming  languages  such  as  Fortran  and  Pascal.  Languages  such  as  Ada  differ  from  traditional 
languages  to  such  an  extent  that  it  is  necessary  to  develop  new  metrics. 

This  paper  proposes  a  number  of  coverage  measures  for  Ada  features  such  as  packages,  generic  units, 
and  tasks,  and  discusses  their  interpretations  in  relation  to  the  traditional  coverage  metrics.  It  also  proposes  a 
mechanism  for  collecting  these  coverage  measures.  In  addition,  it  suggests  that  coverage  metrics  may  also  be 
interpreted  as  indicators  of  dynamic  system  performance. 

[Wu88]  Abstract:  Program  mutation  is  a  suitable  technique  for  investigating  software  reliability  and  quality  con¬ 
trol  since  it  is  able  to  detect  many  potential  errors.  However  it  is  necessary  to  improve  the  technique  for  indus¬ 
trial  practice.  A  new  method  of  program  mutation  is  presented  here  which  increases  the  feasibility,  effectiveness 
and  efficiency  of  searching  for  those  errors  which  have  escaped  from  the  activities  of  Afterers  and  competent 
programmers.  It  is  based  on  syntax  direction  and  it  is  aided  by  the  language  semantics.  This  means  that  the  scope 
of  a  program  mutation  (i.e.  the  mutation  rules  of  the  method),  and  its  corresponding  mutants,  are  rigorously 
directed  by  a  syntax  and  related  semantics  as  defined  by  the  tester.  A  paradigm  for  the  mutation  syntax  and 
semantics  when  limited  to  boolean  expressions  and  the  corresponding  test  coverage  metrics  in  terms  of  this 
method  are  given  in  the  paper. 

[Wulf76]  Abstract:  The  programming  language  Alphard  is  designed  to  provide  support  for  both  the  methodolo¬ 
gies  of  “well-structured”  programming  and  the  techniques  of  formal  program  verification.  Language  constructs 
allow  a  programmer  to  isolate  an  abstraction,  specifying  its  behavior  publicly  while  localizing  knowledge  about  its 
implementation.  The  verification  of  such  an  abstraction  consists  of  showing  that  its  implementation  behaves  in 
accordance  with  its  public  specifications;  the  abstraction  can  then  be  used  with  confidence  in  constructing  other 
programs,  and  the  verification  of  that  use  employs  only  the  public  specifications. 

This  paper  introduces  Alphard  by  developing  and  verifying  a  data  structure  definition  and  a  program  that 
uses  it.  It  shows  how  each  language  construct  contributes  to  the  development  of  the  abstraction  and  discusses  the 
way  the  language  design  and  the  verification  methodology  were  tailored  to  each  other.  It  serves  not  only  as  an 
introduction  to  Alphard,  but  also  as  an  example  of  the  symbiosis  between  verification  and  methodology  in 
language  design.  The  strategy  of  program  structuring,  illustrated  for  Alphard,  is  also  applicable  to  most  of  the 
“data  abstraction”  mechanisms  now  appearing. 

[Yama83]  Abstract:  A  stochastic  model  for  a  software  error  detection  process  in  which  the  growth  curve  of  the 
number  of  detected  software  errors  for  the  observed  data  is  S-shaped  is  investigated.  The  software  error  detection
model  is  a  nonhomogeneous  Poisson  process  where  the  mean-value  function  has  an  S-shaped  growth  curve.
The  model  is  applied  to  actual  software  error  data,  and  the  maximum-likelihood  estimates  (MLEs)  of  the
unknown  parameters  and  the  related  quantitative  indices  are  obtained.  Statistical  inference  on  the  unknown 
parameters  is  discussed.  Comparison  with  other  models  indicates  that  the  model  presented  fits  the  observed  data 
better  than  other  models. 
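
As a rough illustration of an S-shaped mean-value function, the sketch below assumes the form usually cited for this model, m(t) = a[1 − (1 + bt)·exp(−bt)], where a is the expected total number of errors and b is a detection-rate parameter. The abstract itself does not state the equation, and the parameter values are hypothetical rather than estimates from the paper's data.

```python
import math


def mean_value(t, a, b):
    """Expected cumulative number of detected errors by time t (assumed S-shaped form)."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def intensity(t, a, b):
    """Derivative of mean_value: expected error detection rate at time t."""
    return a * b * b * t * math.exp(-b * t)


A, B = 100.0, 0.5   # hypothetical parameters
curve = [mean_value(t, A, B) for t in range(0, 21)]
inflection = 1.0 / B   # the detection rate peaks here, giving the S shape
```

The detection rate rises until t = 1/b and falls afterwards, so the cumulative curve starts slowly, steepens, and then levels off toward a.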

[Yau78]  Abstract:  Maintenance  of  large-scale  software  systems  is  a  complex  and  expensive  process.  Large-scale 
software  systems  often  possess  both  a  set  of  functional  and  performance  requirements.  Thus,  it  is  important  for 
maintenance  personnel  to  consider  the  ramifications  of  a  proposed  program  modification  from  both  a  functional 
and  a  performance  perspective.  In  this  paper  the  ripple  effect  which  results  as  a  consequence  of  program  modifi¬ 
cation  will  be  analyzed.  A  technique  is  developed  to  analyze  this  ripple  effect  from  both  functional  and  perfor¬ 
mance  perspectives.  A  figure-of-merit  is  then  proposed  to  estimate  the  complexity  of  program  modification.  This 
figure  can  be  used  as  a  basis  upon  which  various  modifications  can  be  evaluated. 

[Yau79]  Abstract:  Software  maintenance  has  been  the  dominant  factor  contributing  to  the  high  cost  of  software.
In  this  paper,  the  software  maintenance  process  and  the  important  software  quality  attributes  that  affect  the 
maintenance  effort  are  discussed.  Among  these  quality  attributes,  the  stability  of  a  program,  which  indicates  the 
resistance  to  the  potential  ripple  effect  that  the  program  would  have  when  it  is  modified,  is  an  important  one. 
Measures  for  estimating  the  stability  of  a  program  and  the  modules  of  which  the  program  is  composed  are 
presented,  and  an  algorithm  for  computing  these  stability  measures  is  given.  Application  of  these  measures  dur¬ 
ing  the  maintenance  phase  is  discussed  along  with  an  example.  Further  research  efforts  involving  validation  of 
the  stability  measures,  application  of  these  measures  during  the  design  phase,  and  restructuring  based  on  these 
measures  are  also  discussed. 

[Yau80]  Abstract:  A  control  flow  checking  scheme  capable  of  detecting  control  flow  errors  of  programs  result¬ 
ing  from  software  coding  errors,  hardware  malfunctions,  or  memory  mutilation  during  the  execution  of  the  pro¬ 
gram  is  presented.  In  this  approach,  the  program  is  partitioned  into  loop-free  intervals  and  a  database  containing 
the  path  information  in  each  of  the  loop-free  intervals  is  derived  from  the  detailed  design.  The  path  in  each  loop- 
free  interval  actually  traversed  at  run  time  is  recorded  and  then  checked  against  the  information  provided  in  the 
database,  and  any  discrepancy  indicates  an  error.  This  approach  is  general,  but  can  detect  all  uncompensated 
illegal  branches.  Any  uncompensated  error  that  occurs  during  the  execution  of  a  loop-free  interval  and  manifests 
itself  as  a  wrong  branch  within  the  loop-free  interval  is  also  detectable.  The  approach  can  also  be  used  to  check 
the  control  flow  in  the  testing  phase  of  program  development.  The  capabilities,  limitations,  implementation,  and 
the  overhead  of  using  this  approach  are  discussed. 
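
The checking step described in this abstract (recording the path taken through each loop-free interval and comparing it with a database of legal paths) might be sketched as below. The interval names, node labels, and the dictionary-of-sets database are hypothetical; the paper derives its database from the detailed design, which is not modelled here.

```python
# Hypothetical path database: for each loop-free interval, the legal node sequences.
PATH_DB = {
    "interval_1": {("A", "B", "D"), ("A", "C", "D")},
    "interval_2": {("D", "E"), ("D", "F", "E")},
}


def check_trace(trace, path_db=PATH_DB):
    """Return the (interval, path) pairs in a recorded trace that the database does not allow."""
    return [
        (interval, tuple(path))
        for interval, path in trace
        if tuple(path) not in path_db.get(interval, set())
    ]


# Recorded run-time traces: one legal, one containing an illegal branch B -> C.
legal_trace = [("interval_1", ["A", "B", "D"]), ("interval_2", ["D", "E"])]
faulty_trace = [("interval_1", ["A", "B", "C", "D"]), ("interval_2", ["D", "E"])]

legal_report = check_trace(legal_trace)
faulty_report = check_trace(faulty_trace)
```

Any non-empty report corresponds to a discrepancy between the recorded path and the database, which the abstract treats as an indication of an error.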

[Yeh77]  Abbreviated  Preface:  Software  validation  involves  analyzing  software  to  determine  the  extent  to  which 
it  performs  the  logical  functions  intended  by  its  creator.  Techniques  in  software  validation  can  be  classified 
broadly  into  two  categories:  testing  and  verification.  In  this  volume,  the  first  five  chapters  are  concerned  with 
testing  techniques  and  tools,  and  the  remaining  chapters  are  concerned  with  verification  techniques. 

In  the  first  chapter,  Henderson  argues  that  testing  should  be  a  constructive  activity  and  should  be  planned 
during  the  developmental  phase  of  a  program.  In  the  second  chapter,  Huang  gives  a  tutorial  discussion  of  a 
specific  technique  for  testing.  In  Chapter  3,  Goodenough  and  Gerhart  make  a  first  attempt  to  develop  a  theory 
for  software  testing.  In  Chapter  4,  Stucki  presents  a  specific  set  of  software  tools  as  an  aid  for  software  testing. 
In  Chapter  5,  Ramamoorthy  and  Ho  present  a  comprehensive  survey  of  automated  tools  for  testing  large 
software.  Operational  experiences  of  several  major  systems  and  their  limitations  are  also  discussed  in  this 
chapter. 

There  are  two  ways  of  approaching  program  verification.  The  static  approach  considers  a  program  and  its 
specifications  to  be  given.  Mathematical  proofs  are  developed  to  demonstrate  that  the  logical  behavior  of  a  pro¬ 
gram  is  as  specified,  viewing  this  logical  behavior  as  completely  characterized  by  a  set  of  formal  assertions.  The 
constructive  approach  lays  stress  on  the  correct  development  of  a  program.  The  remaining  five  chapters  survey 
various  techniques  in  the  static  approach. 

In  Chapter  6,  London  discusses  the  role  of  software  verification  and  gives  a  tutorial  introduction  to  the
“inductive  assertion”  proof  technique.  In  Chapter  7,  Robinson  and  Levitt  extend  the  inductive  assertion  tech¬ 
nique  to  verify  hierarchically  structured  programs.  In  Chapter  8,  Morris  and  Wegbreit  present  another  proof 
technique  called  subgoal  induction.  In  Chapter  9,  Yeh  gives  yet  another  proof  technique  which  differs  from 
inductive  assertion  and  subgoal  induction  in  [providing  a]  proof  of  total  correctness.  In  Chapter  10,  Katz  and 
Manna  survey  existing  techniques  for  proving  that  programs  terminate. 

Finally,  Ann  Marmor-Squires’  selected  annotated  bibliography  provides  an  easy  guide  to  literature  in  pro¬ 
gram  validation. 

[Yeh79] Introduction: Maury Halstead had a dream! By treating computer programs as neither art forms nor as
examples of mathematical logic, but instead as basic material which can be investigated with the classical
methods of experimental and theoretical natural science, Maury had dreamed of and worked hard toward a unified
and coherent new field he called Software Science, in which attributes of a computer program, such as
implementation efforts, clarity, structure, error rates, language levels, etc., can be derived from its basic metrics
through quantitative hypotheses.

The  special  collection  of  papers  on  Software  Science  not  only  contains  some  of  Maury’s  final  contribu¬ 
tions  to  the  field  he  started,  but  its  diversity  and  sophistication  is  an  assurance  that  Maury’s  dream  will  be  carried 
on. 


[Yin78] Abstract: It has been recognized that success in producing designs that realize reliable software, even
using  Structured  Design,  is  intimately  dependent  on  the  experience  level  of  the  designer.  The  gap  in  this  metho¬ 
dology  is  the  absence  of  easily  applied  quantitative  measures  of  quality  that  ease  the  dependence  of  reliable  sys¬ 
tems  on  the  rare  availability  of  expert  designers. 

Several  metrics  have  been  devised  which,  when  applied  to  design  structure  charts,  can  pinpoint  sections 
of  a  design  that  may  cause  problems  during  coding,  debugging,  integration,  and  modification.  These  metrics  can 
help  provide  an  independent,  unbiased  evaluation  of  design  quality.  These  metrics  have  been  validated  against 
program error data of two recently completed software projects at Hughes. The results indicate that the metrics
can  provide  a  predictive  measure  of  program  errors  experienced  during  program  development. 

Guidelines  for  interpreting  the  design  metric  values  are  summarized  and  a  brief  description  of  an  interac¬ 
tive  structure  chart  graphics  system  to  simplify  metric  value  calculation  is  presented. 

[Yin80]  Abstract:  A  software  design  and  testability  analysis  system  has  been  developed  at  Hughes  Aircraft  Com¬ 
pany  to  measure  the  software  quality  in  terms  of  reliability,  maintainability,  and  testability.  Based  on  software 
design  structure  charts,  the  system  indicates  the  error-prone  and  difficult-to-test  areas  of  software  design  by 
quantitatively  measuring  the  program  complexity  and  testability.  Also  the  system  produces  several  testing  aids 
which  facilitate  integration.  The  results  have  been  successfully  validated  against  several  software  projects  at 
Hughes-Fullerton.

The system is configured for the AMDAHL/470 and is accessible via an HP2648A graphics terminal. It
allows  the  designers  to  interactively  create  and  edit  the  design,  and  automatically  produces  structure  charts  for 
documentation,  metrics  for  quality  measurements,  and  test  plans  for  integration. 

[Youn86a] Abstract: Static concurrency analysis detects anomalous synchronization patterns in concurrent programs,
but may also report spurious errors involving infeasible execution paths. Integrated application of static
concurrency analysis and symbolic execution sharpens the results of the former without incurring the full costs of
the latter applied in isolation. Concurrency analysis acts as a path selection mechanism for symbolic execution,
while symbolic execution acts as a pruning mechanism for concurrency analysis. Methods for combining the
techniques follow naturally from explicit characterization and comparison of the state spaces explored by each,
suggesting a general approach for integrating state-based program analysis techniques in a software development
environment.

[Youn88a]  Abstract:  Analyses  based  on  state-space  models  of  execution  must  omit  some  details  of  execution,  in 
order  to  fold  the  infinite  space  of  possible  program  executions  into  a  sufficiently  small  space  for  analysis.  These 
simplifications  are  generally  justified  by  a  claim  that  the  resulting  analysis  is  conservative  with  respect  to  a  certain 
class  of  faults,  i.e.,  that  the  simplification  will  not  cause  any  faults  to  be  overlooked  in  the  analysis.  We  formalize 
a  notion  of  error-preserving  abstractions  which  captures  this  claim,  give  sufficient  conditions  for  verifying  this 
property  in  practical  cases,  and  discuss  the  role  of  error-preserving  abstractions  in  combining  fault  detection 
techniques. 

[Youn89a]  Abbreviated  Preface:  The  purpose  of  IDA  Paper  P-2132,  SDS  Software  Testing  and  Evaluation:  A 
Review  of  the  State-of-the-Art  in  Software  Testing  and  Evaluation  With  Recommended  R&D  Tasks,  is  to  identify 
the  technology  required  for  effective  and  efficient  testing  and  evaluation  of  Strategic  Defense  System  (SDS) 
software.  This  document  was  prepared  for  the  Strategic  Defense  Initiative  Organization  (SDIO),  and  provides 
an  overview  of  current  testing  and  evaluation  technology,  a  mapping  of  available  technology  against  SDS  needs, 
and  recommendations  to  close  critical  gaps  in  technology. 

[Youn89b] Abbreviated Preface: The purpose of IDA Memorandum M-496, Bibliography of Testing and
Evaluation  Reference  Material,  is  to  present  the  reference  material  acquired  in  the  course  of  developing  IDA 
Paper  P-2132,  SDS  Testing  and  Evaluation:  A  Review  of  the  State-of-the-Art  in  Software  Testing  and  Evaluation 
With  Recommended  R&D  Tasks.  This  document  was  prepared  for  the  Strategic  Defense  Initiative  Organization 
(SDIO). 

[Youn89c]  Abstract:  The  conventional  classification  of  software  fault  detection  techniques  by  their  operational 
characteristics  (static  vs.  dynamic  analysis)  is  inadequate  as  a  basis  for  identifying  useful  relationships  between 
techniques. A more useful distinction is between techniques which sample the space of possible executions, and techniques
which  fold  the  space.  The  new  distinction  provides  better  insight  into  the  ways  different  techniques  can  interact, 
and  is  a  necessary  basis  for  considering  hybrid  fault  detection  techniques. 

[Your76]  Abbreviated  Preface:  In  the  past  few  years,  the  programming  industry  has  been  revolutionized  by  a 
number  of  new  philosophies  and  techniques.  One  of  the  most  popular  of  these  techniques,  structured  program¬ 
ming,  has  led  to  order-of-magnitude  improvements  in  programmer  productivity,  program  reliability,  and  program 
maintenance  costs. 

More  recently,  though,  there  has  been  a  recognition  that  perfectly  structured  GOTO-less  code  is  essen¬ 
tially  worthless  if  the  basic  design  of  the  program  or  system  is  unsound. 

Our  concern  is  with  the  overall  architecture  of  programs  and  systems.  How  should  a  large  system  be  bro¬ 
ken  into  modules?  Which  modules?  Which  ones  should  be  subordinate  to  which?  How  do  we  know  when  we 
have  a  “good”  module  and,  more  important,  how  do  we  know  when  we  have  a  “bad”  module?  What  information 
should  be  passed  between  modules?  Should  a  module  have  the  opportunity  to  access  data  other  than  that  which 
it  needs  to  know  in  order  to  accomplish  its  task?  How  should  the  modules  be  “packaged”  together  into  efficient 
executable  units  in  a  typical  computer? 

Naturally,  the  answers  to  these  questions  are  influenced  by  the  specific  details  of  hardware,  operating  sys¬ 
tem,  and  programming  language  -  as  well  as  the  designer’s  interest  in  such  things  as  efficiency,  simplicity,  main¬ 
tainability,  and  reliability.  Nevertheless,  we  argue  that  questions  such  as  the  ones  posed  above  are  of  a  higher 
level  than  the  detailed  coding  questions  of  “Should  I  use  a  GOTO  here?”  or  “How  can  I  write  a  nested  IF  state¬ 
ment  to  accomplish  this  editing  logic?” 

[Yu88a]  Abstract:  This  paper  presents  the  data  and  capabilities  provided  by  the  Software  Metrics  Data  Collec¬ 
tion  (SMDC)  system.  SMDC  is  an  APL-based  system  that  runs  on  the  UNIX  4.3BSD  system  at  Purdue  Univer¬ 
sity.  The  largest  software  product  in  SMDC  has  more  than  1,000,000  lines  of  code.  SMDC  also  provides  a 
number  of  statistical  functions  and  plotting  routines  that  can  be  used  for  detailed  analysis  of  existing  data.  The 
data  and  tools  in  SMDC  are  available  for  use  by  non-Purdue  researchers  with  some  limitations. 

[Yu88b]  Abstract:  This  paper  presents  the  results  of  analyzing  several  defect  models  using  data  collected  from 
two  large  commercial  projects.  Traditional  models  typically  use  either  program  metrics  (i.e.,  measurements  from 
software  products)  or  testing  time  or  combinations  of  these  as  independent  variables.  The  limitations  of  such 
models  have  been  well-documented.  For  example,  program  metrics  are  difficult  to  compute  for  those  products 
that  consist  of  code  modified  from  previous  versions.  Another  example  is  that  testing  time  is  not  available  at  the 
beginning  of  the  testing  phase.  The  models  considered  in  this  paper  all  use  the  number  of  defects  detected  in  the 
earlier  phases  of  the  development  process  as  the  independent  variable.  This  number  can  be  used  to  predict  the 
number of defects to be detected later, even in modified software products. We have found a very strong correlation
between  the  number  of  earlier  defects  and  that  of  later  ones.  Using  this  relationship,  we  constructed  a  mathemat¬ 
ical  model  which  may  be  used  to  estimate  the  number  of  defects  remaining  in  software.  This  defect  model  may 
also  be  used  to  guide  software  developers  in  evaluating  the  effectiveness  of  the  software  development  and  testing 
processes. 

[Zafi80] Abstract: The production of error-free protocols or complex process interactions is essential to reliable
communications. This paper presents techniques for both the detection of errors in protocols and for prevention
of errors in their design. The methods have been used successfully to detect and correct errors in existing
protocols. A technique based on a reachability analysis is described which detects errors in a design. This
“perturbation technique” has been implemented and has successfully detected inconsistencies or errors in existing
protocol designs including both X.21 and X.25. The types of errors handled are state deadlocks, unspecified
receptions, nonexecutable interactions, and state ambiguities. These errors are discussed and their effects considered.
An  interactive  design  technique  is  then  described  that  prevents  design  errors.  The  technique  is  based  on  a  set  of 
production  rules  which  guarantee  that  complete  reception  capability  is  provided  in  the  interacting  processes. 
These  rules  have  been  implemented  in  the  form  of  a  tracking  algorithm  that  prevents  a  designer  from  creating 
unspecified  receptions  and  nonexecutable  interactions  and  monitors  for  the  presence  of  state  deadlocks  and 
ambiguities. 
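
The reachability analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the
authors' perturbation technique: it is a plain breadth-first exploration of the global states of two processes that
exchange messages over bounded FIFO channels. The names analyze, step, and stuck_on, the queue bound, the toy
REQ/ACK/NAK protocol, and the simplified definitions of deadlock and unspecified reception are all assumptions made
for illustration.

```python
from collections import deque

def step(proc, state, inbox, outbox, max_queue):
    """Yield (next_state, inbox, outbox) for every enabled transition."""
    for action, msg, nxt in proc.get(state, []):
        if action == "send" and len(outbox) < max_queue:
            yield nxt, inbox, outbox + (msg,)
        elif action == "recv" and inbox and inbox[0] == msg:
            yield nxt, inbox[1:], outbox

def stuck_on(proc, state, chan):
    """True if `state` has no send transition and cannot accept the head of `chan`."""
    moves = proc.get(state, [])
    if not chan or any(a == "send" for a, _, _ in moves):
        return False
    return all(m != chan[0] for _, m, _ in moves)

def analyze(p1, p2, start, max_queue=2):
    """Explore global states (s1, s2, chan 1->2, chan 2->1) breadth first."""
    init = (start[0], start[1], (), ())
    seen, work = {init}, deque([init])
    deadlocks, unspecified = set(), set()
    while work:
        s1, s2, c12, c21 = work.popleft()
        # Process 1 reads from c21 and writes to c12; process 2 the reverse.
        succs = [(n, s2, out, inb) for n, inb, out in step(p1, s1, c21, c12, max_queue)]
        succs += [(s1, n, inb, out) for n, inb, out in step(p2, s2, c12, c21, max_queue)]
        if stuck_on(p1, s1, c21):
            unspecified.add((1, s1, c21[0]))
        if stuck_on(p2, s2, c12):
            unspecified.add((2, s2, c12[0]))
        # Simplified deadlock: nothing can move and both channels are empty.
        if not succs and not c12 and not c21:
            deadlocks.add((s1, s2))
        for g in succs:
            if g not in seen:
                seen.add(g)
                work.append(g)
    return deadlocks, unspecified

# Hypothetical protocol: process 2 may answer NAK, which process 1 never expects.
requester = {"idle": [("send", "REQ", "wait")],
             "wait": [("recv", "ACK", "idle")]}
responder = {"idle": [("recv", "REQ", "busy")],
             "busy": [("send", "ACK", "idle"), ("send", "NAK", "idle")]}

deadlocks, unspecified = analyze(requester, responder, ("idle", "idle"))
print("deadlocks:", deadlocks)
print("unspecified receptions:", unspecified)
```

For this toy protocol the exploration finds no state deadlock and one unspecified reception: process 1 waiting in
state "wait" with NAK at the head of its input channel.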

[Zeil81b] Abstract: Many testing methods require the selection of a set of paths over which testing is to be conducted.
This paper presents an analysis of the effectiveness of individual paths for testing predicates in linearly
domained  programs.  A  measure  is  derived  for  the  marginal  advantage  of  testing  another  path  after  several  paths 
have  already  been  tested.  This  measure  is  used  to  show  that  any  predicate  in  such  programs  may  be  sufficiently 
tested  using  at  most  m+n+1  paths,  where  m  is  the  number  of  input  values  and  n  is  the  number  of  program  vari¬ 
ables. 

[Zeil83a]  Abstract:  Many  testing  methods  require  the  selection  of  a  set  of  paths  on  which  tests  are  to  be  con¬ 
ducted.  Errors  in  arithmetic  expressions  within  program  statements  can  be  represented  as  perturbing  functions 
added  to  the  correct  expression.  It  is  then  possible  to  derive  the  set  of  errors  in  a  chosen  functional  class  which 
cannot  possibly  be  detected  using  a  given  test  path.  For  example,  test  paths  which  pass  through  an  assignment 
statement “X := f(Y)” are incapable of revealing if the expression “X - f(Y)” has been added to later statements.
In  general,  there  are  an  infinite  number  of  such  undetectable  error  perturbations  for  any  test  path.  However, 
when  the  chosen  functional  class  of  error  expressions  is  a  vector  space,  a  finite  characterization  of  all  undetect¬ 
able  expressions  can  be  found  for  one  test  path,  or  for  combined  testing  along  several  paths.  An  analysis  of  the 
undetected  perturbations  for  sequential  programs  operating  on  integers  and  real  numbers  is  presented  which  per¬ 
mits  the  detection  of  multinomial  error  terms.  The  reduction  of  the  space  of  (potential)  undetected  errors  is  pro¬ 
posed  as  a  criterion  for  test  path  selection. 
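
The example of an undetectable perturbation given in this abstract can be checked numerically. The Python sketch
below is illustrative only: the function f, the later statement, and the multiplier c are hypothetical choices, not
taken from the paper. Because X equals f(Y) immediately after the assignment, any multiple of X - f(Y) added to a
later statement evaluates to zero on every input that follows this path.

```python
def f(y):
    # Hypothetical stand-in for the function f in "X := f(Y)".
    return 2 * y + 1

def correct(y):
    x = f(y)           # X := f(Y)
    z = 3 * x + y      # later statement, correct form
    return z

def perturbed(y, c=5):
    x = f(y)           # X := f(Y)
    # Same later statement with the error term c * (X - f(Y)) added.
    z = 3 * x + y + c * (x - f(y))
    return z

# No input along this path can tell the two versions apart.
undetected = all(correct(y) == perturbed(y) for y in range(-10, 11))
print(undetected)
```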

[Zeil84] Abstract: The use of algebraic techniques in defining a neighborhood of functions is particularly suited
to  testing  for  computation  errors.  Two  possible  approaches  are  Howden’s  algebraic  testing  method  and  perturba¬ 
tion  testing,  which  in  this  paper  is  generalized  to  permit  analysis  of  individual  test  points  rather  than  entire  paths. 
These  approaches  are  shown  to  be  mathematically  equivalent  when  applied  to  a  program’s  black-box  output.  Per¬ 
turbation  testing,  however,  offers  more  flexibility  in  the  choice  of  potential  errors  to  be  investigated.  A  significant 
alternative  offered  by  perturbation  testing  is  the  ability  to  work  in  the  static  domain,  choosing  test  data  to  elim¬ 
inate  possible  error  terms  in  specific  assignment  and  output  statements. 

[Zeil86] Abstract: This paper introduces a new testing strategy, EQUATE testing. EQUATE represents an
attempt  to  merge  the  strengths  of  perturbation  testing  and  mutation  testing  in  order  to  provide  a  testing  strategy 
that  offers  support  for  data  and  functional  abstraction,  that  detects  a  wide  variety  of  simple  faults,  and  that  also 
provides  good  coverage  of  combinations  of  those  simple  faults.  EQUATE  selects  a  number  of  test  locations 
throughout  the  program  and  chooses  a  set  of  expressions  derived  from  the  abstract  syntax  tree  of  the  modules 
being  tested.  Test  data  is  required  that  distinguishes  each  pair  of  these  expressions  from  one  another  at  every  test 
location. 
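
The requirement stated at the end of this abstract, that test data distinguish each pair of expressions at a test
location, can be sketched in Python as follows. This shows only the distinguishing check, not the EQUATE strategy
itself; the function undistinguished_pairs, the four expressions, and the test points are hypothetical.

```python
from itertools import combinations

def undistinguished_pairs(exprs, tests):
    """Return the pairs of expression names that agree on every test point."""
    values = {name: [fn(*t) for t in tests] for name, fn in exprs.items()}
    return [(p, q) for p, q in combinations(exprs, 2) if values[p] == values[q]]

# Hypothetical expressions over the variables (a, b) at one test location.
exprs = {
    "a+b": lambda a, b: a + b,
    "a*b": lambda a, b: a * b,
    "a-b": lambda a, b: a - b,
    "2*a": lambda a, b: 2 * a,
}

weak = undistinguished_pairs(exprs, [(2, 2)])
strong = undistinguished_pairs(exprs, [(2, 2), (1, 3)])
print(weak)
print(strong)
```

With the single test point (2, 2), three of the expressions all evaluate to 4 and cannot be told apart; adding the
point (1, 3) separates every pair.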

[Zeil87] Abstract: The philosophy of composing new software tools from previously created tool fragments can
facilitate the development of software systems. An examination is made of the extension of this philosophy to the
design  of  program  interpreters,  demonstrating  how  the  separation  of  interpretation  into  a  core  algorithm,  value- 
kind  definitions,  and  computation  model  allows  the  capture  of  conventional  execution  models,  symbolic  execu¬ 
tion  models,  dynamic  dataflow  tracking,  and  other  useful  forms  of  program  interpretation.  An  interpretation 
system based on this separation, called ARIES, is currently under development.

[Zeil88a]  Abstract:  A  given  path  selection  criterion  is  more  selective  than  another  such  criterion  with  respect  to 
some  testing  goal  if  it  never  requires  more,  and  sometimes  requires  fewer,  test  paths  to  achieve  that  goal.  This 
paper  presents  canonical  forms  of  control-flow  and  data-flow  path  selection  criteria  and  demonstrates  that,  for 
some  simple  testing  goals,  the  data-flow  criteria  as  a  general  class  are  more  selective  than  the  control-flow  cri¬ 
teria.  It  is  shown,  however,  that  this  result  does  not  hold  for  general  testing  goals,  a  limitation  that  appears  to 
stem  directly  from  the  practice  of  defining  data-flow  criteria  upon  the  computation  history  contributing  to  a  sin¬ 
gle  result. 

[Zeil88b]  Abstract:  Most  testing  methods  do  not  fare  well  with  software  whose  modules  contain  data  and  opera¬ 
tions  at  widely  varying  levels  of  abstraction.  With  the  evolution  of  new  design  techniques  and  new  languages  that 
encourage  the  use  of  more  abstract  data  types,  it  is  becoming  increasingly  important  that  testing  methods  begin 
to deal with abstraction in a reasonable and consistent manner. The EQUATE testing strategy offers strong support
for abstraction. EQUATE selects a number of test locations throughout the program and
chooses a set of expressions derived from the abstract syntax tree of the module being tested. Test data is
required that distinguishes these expressions from one another at every test location. The time complexity of
EQUATE is at worst O(Lp) and the space complexity O(Lp^3), where Lp is the length of the program under test.

[Zeil88c] Abstract: Despite the important role played by the notion of abstraction in modern methods of
software  design  and  implementation,  relatively  little  consideration  has  been  paid  to  the  interaction  between 
abstraction  and  testing  criteria,  especially  for  automatable  criteria.  A  survey  of  relevant  syntactic,  semantic,  and 
methodological  problems  is  presented,  and  a  brief  overview  is  presented  of  research  by  the  author  aimed  at 
developing  testing  criteria  free  of  those  problems. 

[Zeil89]  Abstract:  Perturbation  testing  is  an  approach  to  software  testing  which  focuses  on  faults  within  arith¬ 
metic  expressions  appearing  throughout  a  program.  In  this  paper  perturbation  testing  is  expanded  to  permit 
analysis  of  individual  test  points  rather  than  entire  paths,  and  to  concentrate  on  domain  errors.  Faults  are 
modeled  as  perturbing  functions  drawn  from  a  vector  space  of  potential  faults  and  added  to  the  correct  form  of 
an  arithmetic  expression.  Sensitivity  measures  are  derived  which  limit  the  possible  size  of  those  faults  that  would 
go  undetected  after  the  execution  of  a  given  test  set.  These  measures  open  up  an  interesting  new  view  of  testing, 
in  which  attempts  are  made  to  reduce  the  volume  of  possible  faults  which,  were  they  present  in  the  program 
being tested, would have escaped detection on all tests performed so far. The combination of these new measures
with standard optimization techniques yields a new test data generation method, called arithmetic fault
detection. 

[Zelk78] Abstract: Software engineering refers to the process of creating software systems. It applies loosely to
techniques  which  reduce  high  software  cost  and  complexity  while  increasing  reliability  and  modifiability.  This 
paper  outlines  the  procedures  used  in  the  development  of  computer  software,  emphasizing  large-scale  software 
development,  and  pinpointing  areas  where  problems  exist  and  solutions  have  been  proposed.  Solutions  from 
both  the  management  and  the  programmer  points  of  view  are  then  given  for  many  of  these  problem  areas. 

[Zoln81]  Abstract:  Program  complexity  is  a  topic  often  discussed  in  the  literature.  Research  is  ongoing  in  verify¬ 
ing  existing  complexity  measures.  There  is  also  a  continuing  effort  to  produce  and  validate  new  approaches  to  a 
complexity  measure  which  incorporate  ideas  from  a  variety  of  areas. 

Too  often,  however,  approaches  to  complexity  measurement  center  on  a  particular  aspect  of  a  program, 
e.g.,  structures,  without  incorporating  other  relevant  program  characteristics.  The  question  to  be  answered, 
then,  is,  What  aspects  of  a  program  contribute  to  its  complexity? 

This  paper  presents  a  first  step  in  answering  this  question.  Preliminary  results  are  presented  from  a  Delphi 
Survey  on  program  complexity.  The  survey  was  sent  to  a  cross-section  of  programmers,  managers  and  software 
experts.  Respondents  rated  a  large  number  of  characteristics  as  to  their  effect  on  program  complexity.  The  paper 
summarizes the results and includes preliminary analyses.

[Zweb79]  Abstract:  During  the  past  few  years,  several  investigators  have  noted  definite  patterns  in  the  distribu¬ 
tion  of  operators  in  computer  programs.  Their  proposed  models  have  provided  explanations  for  other  observed 
software  phenomena  and  have  suggested  possible  relationships  between  programming  languages  and  natural 
languages.  However,  these  models  contain  notable  deficiencies. 

This  study  concentrates  on  a  set  of  production  programs  written  in  PL/I.  Using  some  basic  relationships 
from  software  science,  and  a  previously  published  algorithm  generation  technique,  a  model  for  computing  opera¬ 
tor  frequencies  is  constructed  which  is  based  only  on  the  number  of  distinct  operators  in  the  program  and  the 
total  number  of  operator  occurrences.  The  model  provides  a  considerable  statistical  improvement  over  existing 
models  for  the  PL/I  programs  studied. 

[Zweb89] Abstract: Weyuker has recently proposed a set of properties which should be satisfied by any reasonable
criterion used to claim that a computer program has been adequately tested. She called these properties
“axioms.”  She  also  evaluated  several  well-known  testing  strategies  with  respect  to  these  properties,  and  con¬ 
cluded  that  some  of  the  commonly  used  strategies  failed  to  satisfy  several  of  the  properties. 

We  question  both  the  fundamental  nature  of  the  properties  and  the  precision  with  which  they  are 
presented,  and  illustrate  how  a  number  of  ideas  in  Weyuker’s  paper  can  be  simplified  and  clarified  through 
greater  precision  and  a  more  consistent  set  of  definitions.  We  also  reanalyze  the  testing  strategies  after  account¬ 
ing  for  these  inconsistencies.  The  strategies  tend  to  fare  much  better  as  a  result  of  this  reanalysis. 

[vanH68]  Abstract:  The  designer  of  a  computing  system  should  adopt  explicit  criteria  for  accepting  or  rejecting 
proposed  system  features.  Three  possible  criteria  of  this  kind  are  input  recordability,  input  specifiability,  and 
asynchronous  reproducibility  of  output.  These  criteria  imply  that  a  user  can,  if  he  desires,  either  know  or  control 
all  the  influences  affecting  the  content  and  extent  of  his  computer’s  output.  To  define  the  scope  of  the  criteria, 
the  notion  of  an  abstract  machine  of  a  programming  language  and  the  notion  of  a  virtual  computer  are  explained. 
Examples of applications of the criteria concern the reading of a time-of-day clock, the synchronization of parallel
processes, protection in multiprogrammed systems, and the assignment of capability indexes.

[vonH85]  Ada  packages  are  the  basic  building  blocks  of  Ada  programs.  The  separation  in  Ada  into  package  visi¬ 
ble  part  and  body  is  intended  to  support  a  programming  style  that  employs  modularization,  encapsulation  and 
information  hiding.  Unfortunately,  the  visible  part  provides  only  the  syntactic  interface  to  the  package;  it  does 
not  convey  any  information  about  the  meaning  of,  e.g.,  visible  subprograms.  Instead,  when  the  user  of  an  Ada 
package  wants  to  understand  what  services  it  provides  he  needs  to  study  the  package  body.  Thus,  the  purpose  of 
the  separation  into  visible  part  and  body  is  somewhat  subverted  if  the  body  is  the  only  place  where  semantic 
information  can  be  found. 